Solving the drink problem


The United Kingdom’s new guidelines on alcohol consumption are a sound example of
evidence-based policymaking.


In his landmark song ‘Heroes’, David Bowie sang: “I, I'll drink all the
time.” Alcohol played such a part in Bowie's life that many tributes
have taken care to point out that the musician was a non-drinker 
at the time of his death at the weekend. 

Britain has a curious relationship with alcohol, as generations of 
visitors from abroad have experienced and pondered first-hand on any 
given evening. Whereas the people of other countries might drink to 
be sociable or as part of a meal, large numbers of Britons, many have 
observed, tend to drink alcohol like someone is trying to take it away. 

Well, now somebody is — at least according to the reaction of some 
media commentators to last week's shift in official government guidelines
on how much alcohol consumption is advisable. Just in time to
reinforce any wavering new-year pledges to cut down on drinking, the 
UK Chief Medical Officers announced that neither men nor women 
should consume more than 14 units of alcohol a week — around 
7 glasses of wine or 6 pints of average-strength beer. For British men, 
the amount is substantially less than the previous maximum guideline 
of 21 units per week. (The new advice is, at this stage, only draft guidance.)
The guide amount is also less than comparable advice issued
by many other nations. 

Predictably, most dissent focused on the political argument that the 
government has no business telling people how to live their lives, and, 
presumably, speed their own deaths. Right-wing UK politician Nigel 
Farage led the (only just tongue-in-cheek) calls for those outraged 
by the latest example of “nanny state” politics to protest by heading 
immediately to the pub. 

Disagreement with the scientific and medical basis for the new 
guidelines was more half-hearted. Most people in Britain seem to 
grudgingly accept that drinking too much is a bad thing, just as they 
have for a series of antisocial and unhealthy behaviours targeted in 
recent times — driving without seatbelts, supermarkets placing racks 
of chocolate at tills at child-friendly heights, and smoking, for instance. 
(This is a nation, remember, that felt it had to point out in official guidance
as recently as 1984 that 56 drinks in a single week was “too much”.)

In fact, despite some attempts to whip up outrage, there are signs 
that the British government is pushing against an open door in its 
attempts to get people to drink less. Alcohol consumption is reportedly 
falling, the number of people who abstain entirely is increasing, and 
the plague of young binge drinkers is in decline. 

The statement that there is no ‘safe’ level of alcohol consumption is a 
solid one. Those who wish to dispute this should first read the evidence 
produced by the guidelines development group for the Chief Medical 
Officers, which includes modelling to balance risks and benefits (see 
go.nature.com/aauzdp). It shows that the past 20 years have produced 
a wealth of new evidence strongly linking alcohol use to cancer risk. 
And — contrary to the legion of newspaper stories — the minor health 
benefits of drinking are realized only by women over the age of 55, and 
then only at very low consumption levels. Red wine won't save you
from occasionally having to take a bit of exercise.

Decades hence, society may look back at today’s acceptance and 
even celebration of alcohol and shake its collective head in the same 
way that we now view the acceptance of tobacco smoking, or the use 
of opium as a tonic.

Having an evidence-based recommendation is one thing. Actually
changing behaviour is quite another. Millions of British men and
women admit to routinely drinking more than they should. A sizeable
fraction of those still drink more than 50 units a week. And the UK
experts also pointed out the (not so) sobering fact that behavioural
experts “found little evidence regarding the impact of any guidelines
in changing health behaviours”.


Still, it is a starting point, and the scientists 
whose work fed into the new guidelines should be proud. Converting 
solid evidence into scientifically grounded policy is something that 
everyone can raise a glass to. And more people now have the evidence 
to decide for themselves what type of drink should go into it.


A secure future


Research advances mean that the time is ripe 
to ratify the ban on testing nuclear weapons. 


This year marks the twentieth anniversary of the Comprehensive
Nuclear-Test-Ban Treaty (CTBT) agreement, so the timing of the
latest nuclear blast from North Korea is pertinent. The country’s
continued testing (this is its fourth test since 2006) puts it on a path
to developing miniaturized warheads that could be placed on missiles,
risking an arms race in the region and increased global instability.

North Korea is one reason why the CTBT is not yet in force. The
dictatorship is one of eight nuclear-capable nations that have yet to
ratify the agreement, along with China, Egypt, India, Iran, Israel,
Pakistan and the United States.

Science may seem to have little leverage in the volatile mixture of
global power struggles and regional stand-offs, but it has been successful
before. A major reason that so many countries were willing to
sign up to the treaty in 1996 was the diligent research by a group of
international scientists, known as the Group of Scientific Experts,
established 20 years earlier in 1976. It had drawn up a credible road
map of what technologies would be needed to verify that no country
could cheat on its treaty obligations by carrying out undetected tests,
thus gaining a military edge over those who abided by the rules.

Scientists can help again now, not least by explaining to politicians
that the United States’ principal technical objections to ratification 
have been overcome. In 1999, the US Senate rejected then-president 
Bill Clinton's push for ratification by a 51-48 vote, with opponents 
unconvinced that the technology was ripe either to detect cheaters, or 
to ensure the reliability and safety of the vast existing US stockpile of 
nuclear weapons without explosive testing. 

Given the intensity of partisan politics in Washington DC today, 
hopes of any renewed effort by the United States to ratify the CTBT 
might seem fanciful. But at a symposium organized by the US Department
of Energy in October 2015, US Secretary of State John Kerry
called for just that, saying that the administration was determined to 
“reopen and re-energize the conversation about the treaty”. 

Backing the case for ratification at the symposium were leading 
government scientists, such as US energy secretary Ernest Moniz — 
who had a key role in brokering the deal between the West and Iran 
over that country’s nuclear programme last July — and the heads of 
US nuclear-weapons labs at the Lawrence Livermore, Los Alamos and 
Sandia National Laboratories. 

Kerry and the scientists pointed out that advances in research 
meant that the Senate's concerns from 1999 are no longer relevant. 
The detection within minutes of last week’s nuclear test by North 
Korea once again demonstrates that the International Monitoring 
System of the Vienna-based Preparatory Commission for the Comprehensive
Nuclear-Test-Ban Treaty Organization is up to the job it
was designed to do. The US Stockpile Stewardship Program for its
nuclear weapons, established in 1995, has also shown that advances 
in computer simulations and other technologies can assure the safety 
and reliability of its stockpile without any nuclear testing. 

Although the CTBT has yet to enter into force, it has set an international
standard. With the exception of North Korea, all countries have
refrained from nuclear testing since 1998, when India and Pakistan 
each carried out two nuclear tests. 

The United States has an opportunity to show leadership. By ratifying 
the CTBT, it would put huge pressure on China, India, Pakistan and 
other countries to do likewise. Iran, having scored a major diplomatic 
success with its nuclear deal with six world powers, is also in a strong 
position to support ratification. That would leave the signature of North 
Korea, probably the most recalcitrant non-signatory, for the CTBT to be 
able to enter into force. But as the Iran deal and the Paris climate negotiations
show, diplomacy can prevail in the most difficult circumstances.

The CTBT alone will not solve all the complex issues of possession
of nuclear weapons, in particular the disingenuous refusal
of nuclear-weapons states to respect their commitment to the 1970 
Nuclear Non-Proliferation Treaty to make serious efforts to disarm. 
But ratification of the CTBT would be a crowning achievement for 
science-based evidence and diplomacy in nuclear disarmament. 
Scientists played a key part in underpinning the nuclear deal with 
Iran; they now need to help to convince politicians that the CTBT is 
another deal in the best interests of international security.


Three new Nature journals


The traditional stamping grounds of Nature and the Nature
journals have been the fundamental sciences — the physical, 
chemical, biological, Earth and environmental sciences. Three 
journals launched this week restate our editorial and publishing 
commitment to these territories. And one of them also delves into 
other disciplines, especially the social sciences, in tackling some of 
the ‘grand challenges’ facing society. 

Nature Energy is the journal with the broadest scope. Like Nature 
Climate Change and Nature Plants, it includes social science and 
policy research: the first issue features papers on ‘Policy trade-offs 
between climate mitigation and clean cook-stove access in South 
Asia’ and ‘Impacts of a 32-billion-gallon bioenergy landscape on
land and fossil fuel use in the US’. But the journal is also committed
to the natural sciences — and indeed to any research that assists 
humankind in getting to grips with the challenges of energy generation,
storage and distribution. In short, Nature Energy will attend to
how science, technologies and people can deliver, and are affected 
by, any and all energy systems. 

Like Nature and all other Nature research and reviews journals, 
Nature Energy's choice of what to publish lies entirely in the hands 
of its in-house editors, who are supported by external peer reviewers.
Everyone on the editorial team (which includes a social scientist)
sits in the same office and is able to work closely together in
assessing submissions. This is of particular value when dealing with 
multidisciplinary submissions — a challenge that the journal sees 
as one of its principal missions. 

Materials research is a key component of the energy-research 
landscape. It also contributes fundamental insights into materials
themselves and provides contexts in which materials can be
applied. High time, our editors and publishers concluded, that a 
Nature journal should survey progress across all these fronts: hence
this week’s launch of Nature Reviews Materials. Like the other two 
journals, it is an online-only subscription journal. 

The launch issue includes reviews that outline the computational
design of energy materials, the latest advances in photovoltaic
devices, the surface properties of superhydrophobic and
icephobic materials, the synthesis of carbon nanostructures and the
design of pro-angiogenic materials, which are valuable in combating
cardiovascular disease. It also focuses on sustainable materials,
immunotherapy materials and the history of nanotechnology and 
the electronics industry. Nature Reviews Materials aims to cover 
the making, measuring, modelling and manufacturing of materials,
looking at materials all the way from laboratory discovery
to their use in functional devices. And in the coming months, the 
journal will analyse the impact that materials research can make in 
the field of medicine and on our environment, ensuring a healthier 
and more sustainable future. 

The third journal is Nature Microbiology. As the most abundant 
living entities on our planet, microorganisms are fundamental to 
every facet of life on Earth. Nature Microbiology is interested in 
all aspects of microorganisms, be it their evolution, physiology 
and cell biology; their interactions with each other, a host or an 
environment; or their societal significance. The editors of Nature 
Microbiology are keen for the journal to be inclusive of all types of 
microorganism, whether bacterial, viral, archaeal or eukaryotic in 
nature. Accordingly, the launch issue features articles on a diverse 
array of microorganisms and topics, including the speciation of 
wild yeasts by hybridization, the global distribution of and disease 
burden caused by a bacterium and the identification of a virus that
borrows its capsid coat from another virus. 

Increasingly, researchers, their funders — both public and 
private — and their institutions recognize that great research needs 
to be pursued in both fundamental and societally useful domains. 
Such research needs to be inclusive, in disciplinary terms, and to 
aim for the highest standards of robustness. It is our hope that the 
Nature group of journals can support these ambitions, and notably 
so in the launches this week.
WORLD VIEW


New chemistry revives elementary question

The periodic table is a public symbol of chemistry. But as it grows larger, we
must stress that science is not just about producing lists, says Philip Ball.

Discuss this article online at go.nature.com/vbssbl


Rarely does chemistry enjoy the limelight as it has in past weeks.
The announcement by the International Union of Pure and
Applied Chemistry (IUPAC) that the seventh row of the periodic
table has been filled through the discovery of four artificially created
elements (numbers 113, 115, 117 and 118) has excited wide public discussion.
What will these substances be called? What chemical properties
do they have? How much further can the periodic table be extended?

This enthusiastic reception is surely a boon for chemistry. But rarely,
also, does a feted scientific discovery have so few implications for the
research agenda. IUPAC’s announcement is not even of a discovery as
such, but of the organization’s assessment that the claims for the elements
pass muster, and of its judgement on whose claims take precedence. The
handful of laboratories worldwide that are equipped for the awe-inspiring
task of making new elements did not require this seal of approval
before pressing further into uncharted terrain.

Every new superheavy element raises interesting questions: whether
there exists a region in which nuclear stability increases rather than
diminishes with increasing mass, for example, and whether relativistic
effects of the ultrafast movement of electrons distort the repeating
patterns of properties in the table. There is plenty to celebrate and
to study.

Whether nuclear science is chemistry at all has been in dispute ever
since it began. Ernest Rutherford considered it a great joke that his 1908
Nobel prize for exploring radioactive decay was in chemistry (an attempt,
some say, to win nuclear science back to chemistry after Marie and Pierre
Curie’s work on radioactivity won them a share in the 1903 physics prize).

The case for calling it chemistry was strong in the days when isolating
and analysing radionuclides depended on the skilful and inventive application
of separation methods to tiny quantities of material. The same
might be asserted for studying the properties of the superheavies today.
The experiments that refined and characterized a few atoms apiece of
elements 104 (rutherfordium) to 108 (hassium), each decaying within
tens of seconds at most, are breathtaking examples of ultra-sensitive
chemical analysis. But the methods used to make the elements in the
first place, bombarding heavy nuclei with heavy ions by accelerating the
ions to energies capable of penetrating the repulsive electrostatic barrier
around the target nuclei, fall squarely within high-energy physics.

A deeper issue is what popular interest in the new elements implies
about the status of the periodic table itself. Its systematization of
elements has made it an icon for chemistry as a whole. Yet chemists
rarely need to refer to it, and most of them work with just a handful
of the more common elements.

It is fair to say that the periodic table holds
more interest and glamour for the public than it does for the working
chemist. That’s awkward: it would seem to open a rift between what 
many people think chemistry is about (study of the elements) and what 
most chemists do (make molecules and materials, and investigate their 
properties and interactions). 

There is, however, nothing unique about this. Tabulation or enumeration
of fundamentals also features in physics (the particles of the standard
model) and biology (the genetic code, lists of genes and taxonomy).
These classification schemes loom large in the popular consciousness, so 
that physics is deemed to be about finding new particles (after the Higgs 
boson come supersymmetric particles, particles of dark matter and so 
on) and biology becomes about identifying ‘genes for’ certain traits. 

An enthusiasm for list-making is understandable. Not only does it 
seem to make complex ideas simpler, but it brings 
order to chaos, and may genuinely point — as the 
periodic table and the standard model do — to 
underlying symmetries and principles. We all like
a good system. But the danger is that science then
starts to look like a ‘piling up of facts’ — a tendency 
that seems, in the age of big data, to be colouring 
public perception and infecting research agendas. 

The challenge for chemists, then, is to find a 
way to capitalize on the allure and coherence of 
the periodic table while avoiding the impression 
that it somehow tells the story of their research. 

This focus adds weight to the question of how 
the new elements will be named. It seems a pity 
that the parochialism and nationalism, bordering
sometimes on chauvinism, of the past (see
germanium, francium, scandium, americium 
and various permutations on the Swedish town of Ytterby) seems 
likely to persist. (Japonium is widely anticipated for element 113, 
because priority for its discovery was awarded to a team at the Japanese 
research institute RIKEN.) Why not take the opportunity to awaken 
the imagination, rather than plant a flag?

I would dearly love to see an element called levium, after the writer 
and chemist Primo Levi. His The Periodic Table (Einaudi, 1975) 
remains the best book ever written about chemistry, and it would 
please my sense of irony to see a superheavy element given a name 
that could be interpreted as a reference to lightness. 

Yet this is not just about levity. Levi’s account of his time in the 
Auschwitz concentration camp, 1947’s If This Is a Man, is one of the 
century’s most profound and humane works, testament to the fact that
science can be a liberating, universal force for salvation, while recognizing
its potential to be abused in terrible ways. Levium would signify
that the periodic table is for all of humanity.


Philip Ball is a science journalist in London. 
e-mail: p.ball@btinternet.com 
RESEARCH HIGHLIGHTS
Selections from the scientific literature


CRISPR fixes muscle disease


Three teams of researchers 
have used CRISPR-Cas9 gene 
editing to treat mice that have 
the most common and severe 
form of muscular dystrophy. 

Duchenne muscular 
dystrophy is a fatal disease 
caused by mutations that 
disable the gene encoding 
dystrophin, an important 
muscle protein. Teams 
led by Charles Gersbach 
of Duke University in 
Durham, North Carolina; 
Amy Wagers of Harvard 
University in Cambridge, 
Massachusetts; and Eric Olson 
of the University of Texas 
Southwestern Medical Center 
in Dallas used the CRISPR-Cas9
gene-editing technique
to repair the dystrophin 
gene in mice that have such 
mutations. 

The three teams used 
viruses to shuttle the 
components of the CRISPR-Cas9
system into the muscle
cells of infant and adult mice. 
Treated mice made functional 
dystrophin and showed 
improvements in cardiac and 
skeletal muscle function. 
Science http://doi.org/bbpn (2016); Science http://doi.org/bbps (2016);
Science http://doi.org/bbpp (2016)


Self-folding origami master


Heat can bend a thin polymer film into different shapes inspired by
origami.

Previous self-folding materials could either bend themselves into a
shape and return to their original form, or permanently change shape.
Tao Xie at Zhejiang University in Hangzhou, China, and his colleagues
created a material that could do both. At a relatively low temperature
of around 80°C, the polymer’s molecular chains shift but chemical
bonds in the network remain intact, which causes the material to
temporarily fold into a predefined shape. At a higher temperature of
around 130°C, the bonds break and reform, inducing a permanent change
in the material’s molecular structure. The same polymer could fold into
multiple different shapes, which might eventually be useful in devices
that are deployed in the body or in space.
Sci. Adv. 2, e1501297 (2016)


Electricity at risk in a warmer world


Global warming’s effects on water availability could hamper electricity
production at power plants worldwide in the coming decades.

Michelle van Vliet of Wageningen University in the Netherlands and her
colleagues modelled electricity production throughout the twenty-first
century at more than 24,000 hydroelectric facilities and at about 1,400
water-cooled thermoelectric plants powered by natural gas, coal or
nuclear energy. Decreased stream flow and warmer water temperatures
reduced electricity production at mid-latitudes, where most of the
world’s electricity is generated. Annual usable power capacity decreased
by 7-12% for thermoelectric plants and by 1.2-3.6% for hydroelectric
plants in the 2050s.

The authors suggest that boosting the efficiency of power plants, along
with other adaptation measures, could reduce these impacts.
Nature Clim. Change http://doi.org/bbsp (2016)


Sharks have a nose for navigation


Sharks use their keen sense of smell for navigation as well as for
feeding.

Andrew Nosal at the Scripps Institution of Oceanography in La Jolla,
California, and his colleagues plugged the noses of wild leopard sharks
(Triakis semifasciata) with cotton balls soaked in petroleum jelly,
tagged the animals with acoustic transmitters and released them
9 kilometres offshore. Over roughly four hours, sharks without nose
plugs swam two-thirds of the way back to shore in relatively straight
paths, whereas sharks with plugged noses took more tortuous paths,
swimming only one-third of the way back.

The sharks could be detecting gradients of chemicals that are associated
with coastal marine life, such as dissolved amino acids, the authors say.
PLoS ONE 11, e0143758 (2016)


CANCER BIOLOGY 


Gene promotes melanoma spread


Suppressing a regulatory gene 
in skin cancer could block 
the spread of cancer cells 
throughout the body. 

The gene, BMI1, has been
linked to the growth of 
certain tumours. To study its 
effect on tumour spread, or 
metastasis, Jacqueline Lees of 
the Massachusetts Institute 
of Technology in Cambridge 
and her colleagues looked at 
melanoma tumours in mice. 
Melanoma cells that expressed 
high levels of BMI1 were more 
likely to spread to the lungs 
than were tumours that had 
normal BMI1 levels. The gene
also promoted resistance 
to drugs, and induced the 
expression of genes that 
have been linked to invasive 
melanoma and poor disease 
prognosis in humans. 

The results suggest that 
BMI1 could be a compelling
drug target, the authors say. 
Genes Dev. 30, 18-33 (2015) 


INFECTIOUS DISEASE 


Poliovirus tweaked for safer vaccines


Poliovirus has been 
genetically modified so that 
it can be used in vaccines 


without the risk of spreading 
the disease. 
Inactivated polio vaccine 
is currently made (pictured) 
using highly infectious 
strains of the virus. To guard 
against accidental release, the 
World Health Organization 
in Geneva, Switzerland, has 
called on manufacturers 
to switch to weakened live 
strains called Sabin strains, 
despite their tendency to 
mutate into infectious forms. 
A team led by Philip Minor 
at the UK National Institute 
for Biological Standards 
and Control in Potters 
Bar created a genetically 
modified Sabin strain that, 
when inactivated, still 
elicited an immune response 
in rats. However, the virus 
did not mutate into active 
forms in cell lines and failed 
to infect macaques, so it 
would be unlikely to spread 
the disease among humans if 
it was accidentally released. 
PLoS Pathog. 11, e1005316 (2015)


HUMAN EVOLUTION 


Immunity boosted by archaic humans


Genes inherited from ancient 
hominins have improved the 
human immune system. 
Homo sapiens interbred 
with Neanderthals and 
other ancient humans 
called Denisovans less than 
100,000 years ago. Janet 
Kelso and her team at the 
Max Planck Institute for 
Evolutionary Anthropology 
in Leipzig, Germany, 
looked for Neanderthal and 
Denisovan genetic ancestry 
that has benefited humans 
by analysing the genomes 
of hundreds of people from 
around the world. They 
found a cluster of three 
Toll-like receptor (TLR) 
genes, which are involved 
in rapidly sensing and 
responding to infections as 
part of the innate immune 
response. Two Neanderthal 
versions of this cluster and 
one from Denisovans are 
common in different human 
populations. The archaic TLR 
genes are linked to reduced
susceptibility to a bacterial
infection of the stomach,
but also to higher rates of
allergies.

In a separate study, a
team led by Lluis Quintana-Murci
at the Pasteur
Institute in Paris identified
innate immunity genes
that Europeans and Asians
seem to have inherited from
Neanderthals, including the
same cluster of TLR genes.
Am. J. Hum. Genet. http://doi.org/bbn3 (2016);
http://doi.org/bbn2 (2016)


SOCIAL SELECTION
Popular topics on social media


Spoof kissing paper sparks debate


A satirical study showing that a mother’s kisses didn't help
injured children to feel better left several clues that it was fake.
The funder was Proctor and Johnson, a made-up medical
company, and one of the references was entitled, “So what
the hell is going on here?”. The paper, describing a fictional
randomized controlled trial (RCT) of mothers kissing their
toddlers, was designed to illustrate the limitations of
evidence-based medicine, which uses data from such clinical trials to
direct the practice of medicine. Many people who shared
the article on Twitter played along with it. Angela Smith,
a urologist at the University of North Carolina School of
Medicine at Chapel Hill, tweeted: “Maternal kisses apparently
ineffective at alleviating boo-boos in RCT-our household
now switching to ‘blowing on it’.” But some commenters said
that the article, which the editor of the Journal of Evaluation
in Clinical Practice knowingly published in his journal, could be
misleading and needs to be clearly labelled as satirical.
J. Eval. Clin. Pract. http://dx.doi.org/10.1111/jep.12508 (2015)

NATURE.COM: For more on popular papers, see go.nature.com/e6rkaj


Squid relatives sped through water


Squid-like creatures that 
lived more than 60 million 
years ago could swim rapidly, 
supporting claims that they 
swam freely rather than just 
near the ocean bottom. 

Fossils of belemnitid marine 
animals from 200 million 
to 66 million years ago are 
common, but Christian Klug
at the University of Zurich
in Switzerland and his
colleagues report three
Acanthoteuthis belemnitid
specimens from Germany
with soft tissue components
that have never been seen
in such fossils before
(reconstruction pictured).
The tissue included fossil
fins and organs called
statocysts, which detect the
direction and acceleration
of movement through water.
These suggest that the
animals, which are relatives
of modern squid, were
fast-swimming predators.

This and other fossil
evidence suggests a
free-swimming, rather than an
ocean-bottom-dwelling,
lifestyle for belemnitids.

Biol. Lett. 12, 20150877 (2016) 


SEVEN DAYS


PEOPLE 


China science prize 
A team led by quantum 
physicist Jian-Wei Pan was 
awarded the first-class prize 
of China’s 2015 National 
Natural Science Award, one 
of the country’s top science 
accolades, on 8 January. Pan 
and his team at the University 
of Science and Technology 
of China in Hefei won for 
their pioneering work in 
quantum entanglement and 
teleportation. For the first 
time in 11 years, no one was 
awarded China’s top science 
prize, the State Supreme 
Science and Technology 
Award. Pharmacologist 
Youyou Tu, who last year 
won China its first science 
Nobel, had been tipped for 
the award. 


BUSINESS


Pharma buyout
Pharmaceutical company
Shire of Dublin is buying rival
firm Baxalta of Bannockburn,
Illinois, in a US$32-billion
deal, after a months-long
pursuit. Both companies
focus on rare-disease areas,
including haematology,
immunology and
neuroscience. The firms say
that as one company they will
be able to make $500 million
in cost savings. Shire will
pay Baxalta shareholders in
cash and shares, giving them
around 34% ownership of
the merged company. The
deal is awaiting approval by
regulators.


NUMBER CRUNCH


3.9 × 10¹³


The number of bacteria in
a typical human, alongside
3 × 10¹³ human cells. This
new estimate challenges the
idea that bacteria outnumber
human cells by 10 to 1.
Source: Sender, R., Fuchs, S. & Milo, R.
Preprint at bioRxiv http://doi.org/bbpz (2016).


Early star remnants


A faraway gas cloud has been discovered that contains tiny
amounts of elements heavier than hydrogen and helium
(such as carbon, oxygen and iron) that are possible remnants
of the Universe's first stars. The elements were detected in
spectra collected by the European Southern Observatory's
Very Large Telescope in Chile, and computer simulations show
how the Universe’s first stars would have exploded and spewed
the elements out (pictured). The results were reported at a
meeting of the American Astronomical Society in Kissimmee,
Florida, on 8 January. The cloud is so distant that it appears as
it did 1.8 billion years after the Big Bang.


Cancer screening
The California sequencing-technology
firm Illumina
announced the formation of
a new company, GRAIL, on
10 January. GRAIL will use
Illumina’s genetic-sequencing
technology to screen for
cancer from a blood sample.
A ‘liquid biopsy’ would
find minuscule amounts of
tumour-specific DNA or RNA
in the blood before the person
felt symptoms of the disease,
when it may be easier to
treat. GRAIL has more than
US$100 million in funding, in
part from Bill Gates and from
Amazon founder Jeff Bezos.


FUNDING
Singapore surge 


Science spending in 
Singapore is set to surge 
by 18%, the government 
announced on 8 January. 
At its annual meeting,
the country’s Research,
Innovation and Enterprise
Council endorsed plans to
invest 19 billion Singapore
dollars (US$13.2 billion)
between 2016 and 2020, up
from 16.1 billion Singapore 
dollars between 2011 and 
2015. The country will 
prioritize research funding 
in four areas: advanced 
manufacturing, health and 
biomedical sciences, services 
and the digital economy, and 
urban sustainability. 
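
The 18% figure follows from the two budget totals given above. A minimal sketch of the arithmetic, using only the numbers in the item (the function and variable names are illustrative):

```python
# Budget totals quoted in the item above, in billions of Singapore dollars.
previous_budget = 16.1  # 2011 to 2015
new_budget = 19.0       # 2016 to 2020


def percent_increase(old, new):
    """Return the rise from old to new as a percentage of old."""
    return (new - old) / old * 100


increase = percent_increase(previous_budget, new_budget)
print(f"Increase: {increase:.1f}%")
```

The comparison is between two five-year totals, so it says nothing about year-by-year growth.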


EVENTS 


H-bomb claims 
North Korea’s fourth
nuclear test on 6 January
was almost certainly not a
hydrogen bomb, contrary
to the country’s claims.
The seismic event caused
by the test was estimated
at magnitude 4.85 by the
Preparatory Commission
for the Comprehensive
Nuclear-Test-Ban Treaty
Organization in Vienna. The
explosion that caused that 
event was probably hundreds 
or thousands of times smaller 
than would have resulted from 
a hydrogen bomb, analysts 
say. North Korea might have 
tested a boosted fission device: 
a conventional fission bomb 
with a small quantity of the 
hydrogen isotopes tritium 
and deuterium added. See 
go.nature.com/gyqqya and 
page 127 for more. 


Science passport 
Seven science publishers, 
including PLOS and the 
American Geophysical Union, 
announced on 7 January
that they will start requiring
researchers to identify 
themselves using the ORCID 
(Open Researcher and 
Contributor ID) system when 
submitting papers. Globally, 
1.8 million researchers have 
registered for ORCID’s unique 
identifiers — machine- 
readable numbers akin to 

a scientific passport. The 
system is run by a non-profit 
organization that aims to 
create a transparent record 
linking scientists to their
research outputs (see Nature 
526, 281-283; 2015). 


Chimps returned 

A legal battle over the 
‘personhood’ of two
chimpanzees has ended with 
their return to a primate 
facility in Louisiana, Science 
reported on 8 January. The 
two chimps were loaned to the 
State University of New York at 
Stony Brook for use as research 
animals. Animal-rights group 
the Nonhuman Rights Project 
sued in New York to have the 
animals released to a sanctuary, 
arguing that the chimps 

should have certain legal rights 
afforded to humans. The return 
of the chimps to the New 
Iberia Research Center in early 
December effectively removes 
the animals from New York's 
jurisdiction. 


Oil-pipeline fight 
Pipeline firm TransCanada 
Corporation said on 6 January 
that it will seek more than 
US$15 billion in compensation
for economic losses under 

the North American Free 
Trade Agreement after the 
Keystone XL pipeline that it 
was due to build was cancelled 
(unused pipes pictured). The 
pipeline would have carried 
relatively dirty oil from tar 
sands in Alberta, Canada, to 
US refineries. But in November 
2015, the US Department 

of State said that the project 


TREND WATCH 


Nations burned off around
143 billion cubic metres of
natural gas, roughly 3.5% of
global production, into the
atmosphere in 2012, according
to researchers at the US National
Oceanic and Atmospheric
Administration (C. D. Elvidge
et al. Energies 9, 14; 2016). Data
from a polar-orbiting satellite
showed that Russia led the way
in terms of volume. The practice
is common in fields that lack
pipelines and markets for natural
gas, and policymakers are looking
for ways to avoid the wastage.


was not in the “national 


interest”. TransCanada, 
which is headquartered in 
Calgary, called the decision 
“arbitrary and unjustified”,
arguing that the project was 
environmentally benign. The 
company is also challenging 
the decision in the US federal 
court. 


POLICY 


Insecticide threat 
The US Environmental 
Protection Agency (EPA) 
said on 6 January that the 
controversial insecticide 
imidacloprid does present 

a threat to bees and other 
pollinators. The preliminary 
risk assessment is the first of 
four on the neonicotinoids, 
an insecticide class that has 
been linked to bee declines. 
The European Food Safety 
Authority announced on
11 January that it would
be updating its own risk
assessments of three 
neonicotinoids — clothianidin, 
thiamethoxam and 
imidacloprid. The European 
Union heavily restricted use of 
neonicotinoids in 2013 on the 
basis of previous evaluations. 


UK drinking guides 


Any level of alcohol intake 
increases cancer risk, according 
to draft guidelines released 

by the UK Chief Medical 
Officers on 8 January. Men 

and women should drink no 
more than 14 units of alcohol 
per week — around 7 glasses 

of wine or 6 pints of average- 
strength beer — according to 
the recommendations, which 
substantially lower the amount 
for men. The models used to 
calculate the recommendations 
considered risks and benefits, 
for instance cancer and alleged 
beneficial cardiovascular 
effects. The guidelines have 


3.5% OF NATURAL GAS WASTED IN FLARES 


New satellite measurements track natural-gas flaring by country as 
policymakers seek to avoid wasting energy — and reduce emissions. 
SEVEN DAYS | THIS WEEK


17 JANUARY 

NASA plans to launch
its Jason-3 satellite
to measure Earth's
sea levels, adding to
knowledge of ocean
circulation and climate
change.
go.nature.com/rqfqmh 


19-21 JANUARY 
The Festival of 
Genomics takes place 
in London, bringing 
together industrialists, 
academics and 
policymakers. 
go.nature.com/cw5hfb 


18-22 JANUARY 
PepTalk, dubbed “The
Protein Science Week”,
convenes in San Diego, 
California. 
www.chi-peptalk.com 


had a mixed reception, with 
some complaints that they are 
‘nannying’. See page 127 for
more. 


FACILITIES 


Linear collider 
Japan should ramp up its 
expertise as it prepares to host 
the world’s next-generation 
particle smasher in the 2020s, 
reports the country’s High 
Energy Accelerator Research 
Organization (KEK) in 
Tsukuba. An action plan 
published on 6 January lays 
out the KEK’s goals for the 
preparation phases of the 
International Linear Collider, 
including a goal to triple the 
number of home-grown
accelerator scientists and 
engineers. In 2012, Japanese 
researchers proposed hosting 
the 31-kilometre-long 
accelerator, which will smash 
electrons together with their 
antimatter partners. However, 
no government has yet 
promised any funding. 


> NATURE.COM 
For daily news updates see: 
www.nature.com/news 
Vietnam begins huge effort
to identify war dead 


World’s largest systematic identification project will use smart DNA-testing technology. 


BY ALISON ABBOTT 


Digging foundations for temples or
schools, harvesting rice in paddy
fields: these are some of the ways that 
the decaying remains of Vietnam War victims 
still turn up, 40 years after the conflict ended. 
Now an effort has begun that will use smart 
DNA technologies to identify the bones of the 
half a million or more Vietnamese soldiers and
civilians who are thought still to be missing. 
It is the largest ever systematic identifica- 
tion effort; only the identification of more than 
20,000 victims of armed conflicts in Bosnia and 


Herzegovina during the 1990s comes close. 

“When I was a 21-year-old in the medical 
corps there, I never imagined that such a project
could ever become possible,” says Vietnam
veteran and genomics pioneer Craig Venter, 
head of the J. Craig Venter Institute in La Jolla, 
California. “We thought of body counts as sta- 
tistics — now, decades later, it may be possible 
to put names to them.” 

Although the United States has repatriated 
and identified most of its war dead, Vietnam 
has so far identified just a few hundred peo- 
ple, using outdated techniques. Yet people 
in Vietnam remain desperate to acquire the 


remains of family members. 

A few years ago, the government responded 
to their plight and asked the Advanced Interna- 
tional Joint Stock Company (AIC) in Hanoi to 
investigate how best to proceed. The AIC con- 
sulted medical-diagnostics company Bioglobe 
in Hamburg, Germany, on how to equip the 
Vietnamese labs and train their scientists. In 
2014, the Vietnamese government announced 
an investment of 500 billion dong (US$25 mil- 
lion) in the project and said that it would 
upgrade its three existing DNA-testing centres. 

This was great news, says Truong Nam Hai, 
head of the Institute of Biotechnology at 
the Vietnam Academy of Science and
Technology, which hosts the first DNA- 
testing laboratory to be upgraded. In the 
1990s, his institute proposed plans for iden- 
tifying the missing, he says. However, “due to 
difficult circumstances at the time”, these did
not take off. 

Last month, the government signed a train- 
ing and consultancy contract with Bioglobe, 
which will allow the sequencing effort to start. 

“The technical challenges are considerable
but tractable,” says Bioglobe’s chief executive,
Wolfgang Höppner, who crafted the proposal
for Vietnam. In the country’s hot and humid 
climate, DNA in bones that have lain in shallow 
graves for decades is likely to have degraded 
extensively. Moreover, contaminants from soil 
microbes can inhibit the enzymes that scien- 
tists use to amplify what little DNA remains 
to levels that can be analysed. And because of 
the large numbers of bones involved, the work 
needs to be done efficiently, adds Höppner.

Höppner’s proposal makes use of kits from
Germany-based biotech company Qiagen, 
which have been designed to protect and reveal 
as much DNA as possible when dealing with 
difficult sources such as old, buried bones, 
and are also amenable to automated, ‘high 
throughput’ processes. 

The identification process involves 
powdering bone samples and chemically break- 
ing down their cells. Before amplification, the 


DNA is extracted in sealed Qiagen cartridges 
that contain chemicals to wash away substances 
that could inhibit the process. Another Qiagen 
kit then checks the amplified DNA against a 
large set of genomic markers to create a DNA 
profile of the sample. The kit can also detect 
whether inhibitors are still present. 

In cases in which inhibitors prove stubborn, 
samples will be analysed manually by slower, 
more complex methods that have been opti- 
mized by an experienced forensic laboratory 
run by the International Commission on Miss- 
ing Persons (ICMP). That lab, in Bosnia and 
Herzegovina’s capital Sarajevo, led the effort to 
identify people killed during the 1990s con- 
flict, including nearly all of the 8,000 or so who 
were massacred in 1995 in Srebrenica. 


TRAINING BEGINS 
The ICMP will also have a role in training Viet- 
namese scientists. Truong’s lab will next month 
send six scientists on a three-month pro- 
gramme. They will spend most of their time 
in Hamburg focusing on DNA tests, but they 
will also have a stint at the ICMP to learn other 
critical aspects of identification: how to avoid 
jumbling bones from different skeletons when 
exhuming them from mass graves, or how to 
look for clues in bones that might aid identi- 
fication, such as pointers to height or gender. 
It was possible to extract useful levels of 
DNA from around 80% of the bones from the 


Srebrenica victims, says Thomas Parsons, head 
of the ICMP lab. The Vietnamese bones have 
been in the ground for longer and in a more 
damaging climate, but highly optimized meth- 
ods and careful selection of skeletal samples 
will help, he says. 

The Vietnam project will also need reference 
DNA from family members to compare with the 
bone DNA from victims. It plans to have an out- 
reach programme calling for people to donate 
saliva samples to create a reference data bank 
— but this will not be easy. Many war victims 
may have died too young to have had children, 
and their parents may also be dead, so reference 
samples will have to come from more distant 
relatives whose DNA is less similar. “That is 
why it is particularly important to do the DNA 
analysis with a larger than normal set of markers,”
says Höppner.

The outreach programme will also call for 
people to come forward with information on 
where bones might be buried. Unlike in Bos- 
nia, where investigators could in some cases 
use satellite imagery to identify mass graves, the 
Vietnamese effort will rely on witness reports, 
as well as on common and military knowledge. 

Once all three government DNA-testing 
centres are upgraded, probably by 2017, they 
will together be able to identify between 8,000 
and 10,000 people a year, says Truong. He also 
anticipates that the DNA project will improve 
Vietnam's scientific culture.


Taiwan’s SARS hero is
poised for vice-presidency 


Epidemiologist who spearheaded response to outbreak is popular with scientists — and others. 


BY DAVID CYRANOSKI IN TAIPEI 


A famous and influential epidemiologist,
Chen Chien-Jen, is set to become
Taiwan's vice-president after elections
on 16 January.

If he does, it is hoped that Chen — an 
epidemiologist looked upon as a hero for his 
role in subduing Taiwan's outbreak of severe 
acute respiratory syndrome (SARS) in 2003 — 
will help to infuse the new government with 
an air of integrity and collaboration, maintain 
good relations with China and stimulate ideas 
for revitalizing the economy. 

“He can negotiate with anyone, and is 
always trying to help,” says the National 
Taiwan University’s president, Yang Pan-Chyr. 
“You wouldn't think such a person would 


be a candidate for a politician.”

Chen announced in November that he 
would be the running mate for Tsai Ing-Wen,
leader of the Democratic Progressive Party 
(DPP). If Tsai were to win, it would be only 
the second time in Taiwan's history that the 
ruling Kuomintang (KMT) party has been 
dethroned. 

Tsai is ahead in all the polls: she leads the 
KMT candidate by 30 percentage points, 
according to the non-profit Cross-Strait 
Policy Association, which carries out research 
on relations between Taiwan and the mainland 
— and the KMT’s own survey puts her lead at 
8 percentage points. 

Chen, too, is popular — the Cross-Strait 
Policy Association puts his ‘admiration’ rating
at 54%, compared with 27% for his counterpart 
in the KMT. This is probably a result of
his celebrity status with regard to the SARS 
epidemic. 

Panic over the viral infection, which 
initially emerged in mainland China but 
quickly spread across many parts of the 
world, was exacerbated in Taiwan because the 
United Nations recognizes China's claim that 
Taiwan is part of China, and thus refuses to 
give it an independent seat at meetings of the 
World Health Organization. Excluded from 
international discussions and sample sharing, 
Taiwan’s outbreak spiralled out of control even
as authorities elsewhere were getting a grip on 
the epidemic. 

It was Chen, who was appointed health 
minister as the epidemic was escalating in 
Taiwan, who headed containment efforts. He 
Presidential candidate Tsai Ing-Wen has picked epidemiologist Chen Chien-Jen to be her running mate in the upcoming Taiwan election.


bolstered attempts to isolate patients so as to 
prevent spread in hospitals, and boosted screen- 
ing for fever. Even today, mentioning his name 
can elicit an enthusiastic thumbs-up. “Chen is 
great,” a taxi driver in Taiwan told Nature in 
early January. “With SARS, he was so fast.” 

Chen is also popular in scientific circles, 
where he is known for other groundbreak- 
ing work. His research on the effects of ars- 
enic exposure led health agencies around the 
world to lower the levels deemed acceptable 
(C.-J. Chen et al. Br. J. Cancer 66, 888-892; 
1992), and his assessment of the risk of liver 
cancer in people with chronic hepatitis led to 
new treatment guidelines (C.-J. Chen et al. 
J. Am. Med. Assoc. 295, 65-73; 2006). An 
online petition supporting Chen’s candidacy 
has received more than 1,600 signatures — 
including those of prominent academics. 
“Within days, hundreds of names poured in,”
says Ming-Liang Lee, the former Taiwanese 
health minister who started the petition. 

Many researchers value Chen's personality. 
“He has the capacity and appeal to pull people 
together,” says Ming-Chu Hsu, chief executive
of TaiGen, one of Taiwan's most successful bio- 
technology companies. 

Chen carries a reassuring air of reliability. 
“He would be someone we can trust. Everyone 
seems to think so,” says Yang.

Chen himself told Nature that attributes 
honed during his time as a scientist — for 
example, the ability to solve problems — are 
beneficial to politics. He also said that it is 
crucial to revitalize Taiwan's stagnant econ- 
omy: increased competition in electronics 
from China and elsewhere has slashed the 
profits that once made Taiwan wealthy. 

Tsai has outlined five areas in which Taiwan 
can innovate: biopharmaceuticals, green 


CROSS-STRAIT COLLABORATION 


The number of science papers co-authored by 
researchers from Taiwan and from mainland 
China has quadrupled in the past ten years. 
energy, big data, precision machinery and
national defence. To bolster those aims, 
Chen plans to establish a research system that 
encourages researchers and entrepreneurs to 
take risks. “Now the government doesn't allow 
failure, so everyone goes for ‘me-too’ modifications,
not innovation,” he told Nature.

Scientists and technology-based indus- 
trialists say that Chen and Tsai’s intention to 
promote innovation could bring a much- 
needed focus on Taiwanese science, although 
advocates are trying to keep things in perspec- 
tive. “I think all science and technology would 
benefit from his taking office,” says Yang. “But
maybe we are expecting too much.” 

The DPP has traditionally emphasized 
Taiwanese autonomy, which riles Beijing, 
but “we don't want to be troublemakers”, says
Chen. He acknowledges that he himself came 
up against the Chinese authorities during the 
SARS epidemic, but says that agreement on 


how to handle information on health and infec- 
tious diseases has largely resolved the issues. 

A continuation of the status quo suits 
neuroscientist Chiang Ann-Shyn at Taiwan's 
National Tsing Hua University; he expects 
Chen to act as an antidote to the DPP’s some- 
times provocative statements on independ- 
ence. “Relations with China have been good. 
I don’t think Chen will do anything radical,” 
he says. Two decades of stable relations fol- 
lowing a crisis in the mid-1990s — when the 
mainland tested missiles in the strait — have 
led to a boom in business between Taiwan and 
the mainland, and research collaborations 
between them have quadrupled in the past ten 
years (see ‘Cross-strait collaboration’).

Hsu agrees with Chiang; her company’s 
antibiotic against multi-drug-resistant Strepto- 
coccus pneumoniae was the first drug devel- 
oped in Taiwan to be submitted for approval 
on the mainland under new rules. “Health is 
one thing we can work on together,” she says. 

Chen becomes emotional when talking 
about the possible end of his research career. 
Until recently, he had assumed that this would 
be at the nation’s premier research organi- 
zation, the Academia Sinica, where he was 
vice-president until he declared himself Tsai’s 
running mate. 

But although he was at first reluctant to 
join the electoral race, he finally decided 
that improving Taiwan’s social and eco- 
nomic situation was more important than 
his research. 

A devout Catholic who consulted his arch- 
bishop before making his decision, Chen says 
that he considers his political career a “calling 
from God”. He adds: “I told the people in my
laboratory that, for the coming years, it’s more 
important that I serve the people.”
EPIDEMIOLOGY


Ebola hunters go 
after viral hideout 


As West Africa epidemic fades, researchers aim to prevent 
recurrences by finding the virus’s natural host. 


BY EWEN CALLAWAY 


With the official end of Ebola
transmission across West Africa 
anticipated on 14 January, an epi- 


demic that killed more than 11,000 people in 
2 years may be starting to fade into history. 
But that does not mean that Ebola has disap- 
peared. The virus remains hidden in animal 
reservoirs, and is almost certain to spill over 
into humans again. 

“We've got to focus on what could potentially
happen next,” says David Pigott, a
spatial epidemiologist at the University of 




Oxford, UK — and that means uncovering 
the species that harbour Ebola in the wild to 
try to prevent deadly outbreaks in the future. 

It is no easy task. Since the disease first 
emerged in Zaire (now the Democratic 
Republic of the Congo) 40 years ago, efforts 
to trace the origins of the outbreaks, includ- 
ing the most recent one, have come up 
frustratingly empty. Wild gorillas and chim- 
panzees in central Africa have experienced 
occasional Ebola outbreaks. But like humans, 
these species are too ravaged by the virus to 
serve as its natural host. Experts say that a 
reservoir species is likely to harbour the virus 


only at low levels, and without becoming sick. 

The leading candidates are several species of 
fruit bat from across central and West Africa 
— where all known Ebola outbreaks have 
originated — that are often hunted for meat. A 
2005 study¹ uncovered Ebola genetic material
in some fruit bats from Gabon and the Demo- 
cratic Republic of the Congo, and detected 
Ebola antibodies in the blood of others. Mar- 
burg virus, which is closely related to Ebola, is 
thought to be transmitted by fruit bats. 

“I firmly believe fruit bats are the reservoir
for Ebola,” says Peter Daszak, a disease ecologist
and president of EcoHealth Alliance, a conser- 
vation organization in New York City that plans 
to survey numerous bat species, including fruit 
bats, in Liberia for signs of Ebola infection. 

Other researchers believe that focus is 
too narrow. “The evidence for fruit bats is 
the strongest, but it’s still weak,” says Fabian 
Leendertz, a wildlife epidemiologist at the 
Robert Koch Institute in Berlin. 

Leendertz suspects another type of bat. He 
led a team that searched for the source of the 
latest West African outbreak in early 2014, a 
few months after a toddler in southern Guinea 
became the first human victim. The team cap- 
tured dozens of bats near the toddler’s village, 
but none — fruit-eating or otherwise — showed 
any conclusive signs of Ebola infection². Still,
circumstantial evidence has led the research- 
ers to suspect that the culprit may have been 
small insect-eating bats living in a tree near the 
toddler’s home. Although the tree had burned 
down before researchers arrived, it had been 
filled with such bats, and villagers told the team 
that children often played in its hollowed-out 
trunk. The team is now looking more closely 
at insectivorous bats, but Leendertz cautions 
against focusing on any one animal. 


UNUSUAL SUSPECTS 
Some researchers advise casting the net even 
wider. “I don’t buy the bat story for Ebola 
virus, not at all,” says virologist Jens Kuhn of
the US National Institute of Allergy and Infec- 
tious Diseases at Fort Detrick, Maryland. He 
thinks that bats are much too abundant and 
too closely associated with humans to explain 
an infection that has emerged just two dozen 
times over the past four decades. “It’s going to 
be a strange host,” he says. Even arthropods or
fungi could be possibilities, he adds. 

Others intend to look at more-familiar 



138 | NATURE | VOL 529 | 14 JANUARY 2016 
© 2016 Macmillan Publishers Limited. All rights reserved 




species. The US Agency for International Devel- 
opment plans a two-year survey of animals 
ranging from rodents to livestock to domestic 
dogs and cats. These animals may not be natural 
reservoirs of Ebola, but they could contribute 
to spillovers into humans, says Dennis Carroll, 
director of the agency’s Pandemic Influenza and 
Other Emerging Threats Unit. 

But with so many question marks hovering 
over the identity of Ebola’s reservoirs, some sci- 
entists say that it is time to eschew virus hunting 
in specific creatures and instead pursue more- 
holistic approaches that examine ecological and 
anthropological factors common to spillovers. 

Tony Goldberg, an epidemiologist at the 
University of Wisconsin-Madison, is one such 
advocate. He no longer subscribes to the view 
that “we have to blanket the continent of Africa 


with field-deployable DNA sequencers and 
sample everything that crawls, flies or swims 
and eventually we'll come across it. I used to 
think that way,” he says, “but I’m cooling off to 
that approach.”

His team is studying how bush-meat hunt- 
ers interact with wild ecosystems to identify 
factors that might be linked to the spillover of 
zoonotic infections such as Ebola. 

In a similar effort, a team led by Pigott and
his colleague epidemiologist Simon Hay is 
looking at past outbreaks for common ecologi- 
cal factors, such as vegetation, elevation and 
the presence of suspected reservoir species 
such as fruit bats and carriers such as apes. By 
modelling these data, the team has created a 
map of areas at risk of Ebola spillovers³.

And Barbara Han, a disease ecologist at the 
Cary Institute of Ecosystem Studies in Mill-
brook, New York, is using machine-learning 
techniques to predict which bat species are 
likely to harbour Ebola and related viruses 
because they share ecological factors common 
to suspected reservoir species. 

Research on Ebola therapies and vaccines 
saw an infusion of public and private funding 
during the epidemic, and scientists hunting 
the virus in the wild hope to capture the same 
sense of urgency and financial support. But 
they know that the job won't be easy. “It has lit 
a fire under people's butts, mine included,” says 
Goldberg. “The problem is, we're not sure what 
to do with the fire.”
1. Leroy, E. M. et al. Nature 438, 575-576 (2005).
2. Saéz, A. M. et al. EMBO Mol. Med. 7, 17-23 (2014).
3. Pigott, D. M. et al. eLife 3, e04395 (2014).


EXOPLANETS 


Rebooted Kepler spacecraft 
hauls in the planets 


Worlds found by K2 mission push beyond original discoveries. 


BY ALEXANDRA WITZE, KISSIMMEE, FLORIDA 


In the second phase of its life as a planet
hunter, NASA’s Kepler spacecraft is raking
in exoplanet discoveries that are surprisingly
different from those found during its
first iteration.

Between 2009 and 2013, Kepler became the
most successful planet-hunting machine ever,
discovering at least 1,030 planets and more
than 4,600 possible others in a single patch
of sky. When a mechanical failure stripped
the spacecraft of its ability to point precisely
among the stars, engineers reinvented it in
2014 as the K2 mission, which looks at different
parts of the cosmos for shorter periods
of time.

In its first year of observing, K2 has netted
more than 100 confirmed exoplanets, says
astronomer Ian Crossfield at the University of
Arizona in Tucson. They include a surprising
number of systems in which more than one
planet orbits the same star (E. Sinukoff et al.
Preprint at http://arxiv.org/abs/1511.09213;
2015). The K2 planets are also orbiting hotter
stars than are many of the Kepler discoveries.

“This is really showing the power and
potential of K2,” says Crossfield. “These
are things we never found with four years
of Kepler data.” He and other scientists
reported the findings last week at a meeting
of the American Astronomical Society in
Kissimmee, Florida.


The original Kepler mission was designed 
to answer a specific question: what fraction of 
Sun-like stars have Earth-sized planets around 
them? Unbound by those constraints — even 
if not as good at pointing itself — K2 has been 
able to explore wider questions of planetary 
origin and evolution. “Now we get to look at 
a much bigger variety,” says Steve Howell, the 
mission's project scientist at NASA’s Ames 
Research Center in Moffett Field, California. 

And because K2 looks at stars that are gen- 
erally brighter and closer to Earth than Kepler 
did, the exoplanets that the mission finds are 
likely to be the best-studied for the foreseeable 
future. This is because they are near enough to 
allow astronomers to explore them with other 
telescopes on Earth and in space. 


UNEXPECTED BOUNTY 
In the past year, K2 has uncovered not just 
planets — such as three super-Earths orbit- 
ing a single star — but also surprises such as 
the disintegrating remains of a planet swirling 
around a white dwarf star. It has even probed 
exploding stars — because K2 stares con- 
stantly at a patch of the sky, it is able to catch 
a supernova as it brightens instead of later in 
its explosion, as other telescopes typically do. 
Among the K2 planets confirmed so far, 
58 are singletons, 28 come from systems with 
at least 2 planets and 14 are triples, Crossfield 
says. In addition, K2 has unearthed more 
than 200 candidate planets, says Andrew 


Vanderburg, an astronomer at the Harvard-
Smithsonian Center for Astrophysics in Cambridge,
Massachusetts.

K2 observes a larger fraction of the cool 
stars known as M dwarfs — the most com- 
mon type of star in the Galaxy — than Kepler 
did. But surprisingly, fewer of the K2 planets 
are orbiting M-dwarf stars. A higher percent- 
age of them, at least so far, circle stars that are 
hotter and more like the Sun, says Courtney 
Dressing, an astronomer at the California 
Institute of Technology (Caltech) in Pasadena. 

K2 will begin a new type of planet-hunting 
on 7 April. Normally the spacecraft searches 
for a temporary dimming of a star caused 
when a planet crosses in front of it. For just 
under three months, however, it will look for 
the temporary brightening of cosmic objects, 
such as a galaxy, caused when a planet bends 
light as it crosses the line of sight between the 
object and the observer. The team expects to 
catch between 85 and 120 of these ‘microlens- 
ing’ planets during the campaign. 

The survey will involve other telescopes 
and be the first automated search to be done 
simultaneously from the ground and in 
space, says Calen Henderson, an astronomer 
at NASA's Jet Propulsion Laboratory in Pasa- 
dena, California. 

That means much more work ahead for 
mission scientists. “Kepler was one field and it 
ruined your summer,” says Caltech astronomer
David Ciardi. “K2 is ruining our whole year.”
Physicists planning to build a neutrino detector in southern India have run into local opposition.


Unfounded rumours delay 
Indian neutrino detector 


Fears frustrate physicists in a global competition to understand elusive particles. 


BY ELIZABETH GIBNEY 


Time is running out for Indian scientists
to build a facility that would let them
compete in one of the hottest races in
physics. The India-based Neutrino Observatory
(INO) — an effort to learn about the
masses and other properties of mysterious
particles called neutrinos — is under threat as
a result of baseless rumours about its aims and
environmental impact. Despite a government
go-ahead in January 2015 to build a massive
detector under a mountain in the southern
state of Tamil Nadu, opposition from environmentalists
and state politicians means that not
a single grain of earth has been shifted.
Neutrinos are abundant subatomic particles 
that are extremely hard to detect. Billions pass 
through each square centimetre of Earth every 
second, but barely any leave a trace. The INO 
would study neutrinos produced when cosmic 
rays strike the atmosphere, and would seek to 
reveal the relative masses of the three known 
types of neutrino. The measurements could 
lead to Nobel-prize-worthy insights into the 
relationship between nature’s four fundamental
forces, as well as the imbalance between matter
and antimatter in the Universe. 

But if the INO is not built soon, other projects
— including one that broke ground in
China a year ago — may get there first, says
D. Indumathi, a theorist at the Institute of
Mathematical Sciences in Chennai who is part of the
INO collaboration, and coordinates outreach for
it. “Longer than a year of delay and I think it will
be difficult to have viable physics goals, at least
of the current type,” she says.


Conceived in 2001 and originally slated for
completion in 2012, the INO has faced a rocky
path to construction. To shield the enormous
detector from the confounding zoo of
subatomic particles that pummels Earth’s surface,
the facility needs to be built more than a
kilometre underground. The first earmarked
site was ruled out in 2009 after a lengthy battle
with conservationists over its proximity to an
elephant and tiger reserve.
The current site, in the Tamil Nadu district 
of Theni, faced opposition as soon as it was put 
forward in 2010. Local villagers worried that
the facility would deplete or contaminate their
restricted water supply, and cut off access to land
for grazing livestock, says Indumathi. But, she 
says, villagers consented after scientists assured 
them that the facility would not interfere with 
their resources. 

Since then, however, local environmental 
organizations and regional politicians have 
taken up the issue, and the list of objections 
has swelled to include fears that the lab will 
emit radiation and store nuclear weapons, and 
that the excavation will threaten a nearby dam. 

The rumours are untrue, says Naba Mondal, 
a physicist at the Tata Institute of Fundamental 
Research in Mumbai who leads the INO collaboration.
INO scientists have visited schools and
held community meetings to counter misconceptions.
But many villagers have turned against
the project. “They don't know what the truth is, 
and I can understand that,” says Mondal. 

At the root of the rumours is mistrust of 
the state and the scientific establishment, says 
Govind Krishnan, an Indian journalist who 
has closely followed the project. He believes 
that the fears that have been raised lie “in the
realm of fantasy”, but are understandable given
the poor environmental record of past state-sponsored
construction projects. Govind disagrees
with activists who say that INO scientists
have ignored the project's impact on the poor,
but he says that scientists’ efforts have been
hampered by class and linguistic barriers.

India’s government allocated 15 billion 
rupees (US$225 million) to construction when 
it gave the INO the green light last year, but 
the Madras High Court in Chennai brought 
the project to a standstill in March following 
a petition from local activists and politicians. 
The court said that the Tamil Nadu Pollution 
Control Board must give consent before construction
can start. This is normally a routine,
45-day step, but the process has so far taken 
9 months, says Mondal. 

The politically contentious nature of the 
project means that the local board may well 
delay until after state elections in May. “I am
confident that it will eventually be approved, 
but the question is when,” says Mondal. The 
delay is damaging the morale of students and 
researchers on the project, he adds. 

Meanwhile, China expects to complete the 
Jiangmen Underground Neutrino Observatory
in 2019. To remain competitive, the
INO must start construction in the next few
months, says Mondal. “Science is something
you have to do in time. If you are not in time,
your results may not be that important.”

But neutrino physicists say that even if the 
INO loses the race, its findings would help 
to corroborate discoveries at other detectors. 
The INO takes a unique approach — using 
50,000 tonnes of magnetized iron to separate 
atmospheric neutrino observations from their 
antineutrino counterparts. That will make its 
results interesting whenever they come out, 
says Mark Messier, a physicist at Indiana University
Bloomington and co-spokesperson for
the NOvA Neutrino Experiment at Fermilab
in Batavia, Illinois, which also has a chance of
solving the neutrino-mass mystery.
Researchers point to other benefits, too. 
Putting a physics laboratory deep underground 
gives India the opportunity to host research 
into areas such as dark matter, they say — and 
it is empowering for Indian scientists to bring a
major physics facility to fruition. “Already I've
seen the tremendous difference it’s made to students
having an experiment on which they call
the shots,” says Indumathi. “So I really don't care
whether we get a Nobel prize or not.”


CORRECTIONS 

The Editorial ‘Fishy limits’ (Nature 528, 435;
2015) wrongly implied that the European
Commission had set the fishing quotas.
They were set by the Council of Ministers.
The News story ‘Feuding physicists turn
to philosophy’ (Nature 528, 446-447;
2015) gave the wrong affiliation for Sabine
Hossenfelder; she is now at the Frankfurt
Institute for Advanced Studies. The News
Feature ‘How to make the most of carbon
dioxide’ (Nature 526, 628-630; 2015) said
that Carbon Recycling International produces
1.5% of global methanol; in fact, it makes
0.005%. The News Feature ‘Space. Time.
Entanglement’ (Nature 527, 290-293; 2015)
wrongly said that Leonard Susskind began
to think about computational complexity
ten years ago — his work in the area began
around three years ago. The News Feature
‘The truth about fetal tissue research’ (Nature
528, 178-181; 2015) incorrectly stated
that around 5.8 billion people have received
vaccines made with the WI-38 and MRC-5
cell lines. In fact, companies have shipped
some 5.8 billion vaccines made with these
two cell lines. And a printing error meant that
an earlier version of the News article ‘What
to look out for in 2016’ (Nature 529, 14-15;
2016) appeared that did not account for the
fact that NASA has cancelled the 2016 launch
of the Mars InSight probe.
A group of young Tibetan monks huddles on a degraded pasture on the Tibetan Plateau.


TROUBLE IN TIBET


Rapid changes in Tibetan grasslands are threatening 
Asia’s main water supply and the livelihood of nomads. 


BY JANE QIU

In the northern reaches of the Tibetan Plateau, dozens of yaks
graze on grasslands that look like a threadbare carpet. The
pasture has been munched down to bare soil in places, and deep
cracks run across the snow-dusted landscape. The animals’
owner, a herder named Dodra, emerges from his home wearing a
black robe, a cowboy hat and a gentle smile tinged with worry.
“The pastures are in a bad state and lack the kind of plants that
make livestock strong and grow fat,” says Dodra. “The yaks are skinny
and produce little milk.”

His family of eight relies on the yaks for most of its livelihood
— milk, butter, meat and fuel. Dodra was forced to give up half of his
animals a decade ago, when the Chinese government imposed strict
limits on livestock numbers. Although his family receives financial
compensation, nobody knows how long it will last.

“We barely survive these days,” he says. “It’s a hand-to-mouth
existence.” If the grasslands continue to deteriorate, he says, “we will
lose our only lifeline”.

The challenges that face Dodra and other Tibetan herders are at odds
with glowing reports from Chinese state media about the health of
Tibetan grasslands — an area of 1.5 million square kilometres — and
the experiences of the millions of nomads there. Since the 1990s, the
government has carried out a series of policies that moved once-mobile
herders into settlements and sharply limited livestock grazing.
According to the official account, these policies have helped to restore
the grasslands and to improve standards of living for the nomads.

But many researchers argue that available evidence shows the 
opposite: that the policies are harming the environment and the 
herders. “Tibetan grasslands are far from safe,” says Wang Shiping,
an ecologist at the Chinese Academy of Sciences’ (CAS) Institute 
of Tibetan Plateau Research (ITPR) in Beijing. “A big part of the 
problem is that the policies are not guided by science, and fail to take 
account of climate change and regional variations.” 

The implications of that argument stretch far beyond the Tibetan 
Plateau, which spans 2.5 million square kilometres — an area bigger 
than Greenland — and is mostly controlled by China. The grasslands, 
which make up nearly two-thirds of the plateau, store water that 
feeds into Asia’s largest rivers. Those same pastures also serve as a 
gigantic reservoir of carbon, some of which could escape into the 
atmosphere if current trends continue. Degradation of the grasslands 
“will exacerbate global warming, threaten water resources for over 
1.4 billion people and affect Asian monsoons”, says David Molden,
director general of the International Centre for Integrated Mountain 
Development (ICIMOD) in Kathmandu, Nepal. 

Such concerns propelled me to make a 4,700-kilometre journey 
last year from Xining, on the northeastern fringe of the plateau, to 
Lhasa in the Tibetan heartland (see ‘Trek across Tibet’). Meeting
with herders and scientists along the way, I traversed diverse 
landscapes and traced the Yellow and Yangtze Rivers to their sources. 
The trip revealed that Tibetan grasslands are far less healthy than 
official government reports suggest, and scientists are struggling to 
understand how and why the pastures are changing. 


FENCED IN 

It began to drizzle soon after we set off from the city of Xining on a 
stretch of newly built highway along the Yellow River. As our Land 
Cruiser climbed onto a 3,800-metre-high part of the plateau, the 
vista opened to reveal rolling hills blanketed by a thick layer of alpine 
meadow, resembling a gigantic golf course. We passed herds of sheep 
and yaks, white tents and nomads in colourful robes — along with 
barbed-wire fences that cut the rangeland into small blocks.

This part of the Tibetan Plateau, in a region known as Henan 
county, is blessed with abundant monsoonal rains every summer. The 
herders who live here are able to maintain healthy livestock and can 
make a decent living. “We have plenty to go around, and the livestock 
are well taken care of,” says herder Gongbu Dondrup.

But life has been different since the government began to fence up 
grasslands around a decade ago, says Dondrup. Before that, he took 
his herd to the best pastures at high elevations in the summer, and 
then came back down in the winter. Now, he must keep the yaks in an 
80-hectare plot that the government assigned to his family. The pasture 
looks worn, and he is being pressed by the government to further 
downsize his herd. “I don’t know how long it can keep us going,” he says. 

The fencing initiative is the latest of a string of Chinese grassland 
policies. After annexing Tibet in 1950, the young revolutionary 
Chinese republic turned all livestock and land into state properties. 
Large state farms competed with each other to maximize production, 
and livestock numbers on the plateau doubled over two decades, 
reaching nearly 100 million by the late 1970s. But in the 1980s, as 
China moved towards a market-based economy, Beijing swung to 
the other extreme: it privatized the pastures and gave yaks back to 
individual households, hoping that the move would push Tibetans to 
better manage their land and so boost its productivity. 

Despite the privatization, nomads continued to use the rangeland 
communally — often in groups led by village elders. Then the 
government began to limit herds, and it built fences to separate 
households and villages. “This has totally changed the way livestock 
are traditionally raised on the plateau, turning a mobile lifestyle into a 
sedentary existence,” says Yang Xiaosheng, director of Henan county's
rangeland-management office.

The fencing policy does have merits when applied in moderation, 
says Yönten Nyima, a Tibetan policy researcher at Sichuan University
in Chengdu. Because an increasing number of nomads now lead a 
settled life — at least for parts of the year — it helps to control the level 
of grazing in heavily populated areas, he says. “Fencing is an effective 
way to keep animals out of a patch of meadow.” Many herders also
say that it makes life much easier: they do not have to spend all day 
walking the hills to herd their yaks and sheep, and if they go away for a 
few days, they don't worry about the animals running off. 

But the convenience comes at a cost, says Cao Jianjun, an ecologist 
at Northwest Normal University in Lanzhou. Fenced pastures 
often show signs of wear after a few years. In a 2013 study, Cao and 
his colleagues measured growth of the sedge species preferred by 
livestock in two scenarios: enclosed pastures and much larger patches 
of land jointly managed by up to 30 households. Despite similar 
livestock densities in both cases, the sedge grew twice as fast in the 
larger pastures, where animals could roam and plants had more 
opportunity to recover. That matches the experience of Henan
county herders, who say that their land sustains fewer animals than it 
has in the past. 


WATER WORRIES 

The future of the grasslands looked even bleaker as we left relatively 
well-to-do Henan county and ventured into the much higher, arid 
territory to the west. After 700 kilometres, we reached Madoi county, 
also known as qianhu xian (‘county of a thousand lakes’), where the
Yellow River begins. Although this region gets only 328 millimetres of 
rain on average each year, about half of what Henan receives, Madoi 
was once one of the richest counties on the plateau — famous for its 
fish, high-quality livestock and gold mines. 

Now, the wetlands are drying up and sand dunes are replacing the 
prairies, which means that less water flows into the Yellow River. Such 
changes on the plateau have contributed to recurring water shortages 
downstream: the Yellow River often dries up well before it reaches the 
sea, an event not recorded before 1970. 

In 2000, China sought to protect this region, along with adjacent 
areas that give rise to the Yangtze and Mekong Rivers, by establishing 
the Sanjiangyuan (or Three-Rivers’ Headwaters) National Nature 
Reserve, an area nearly two-thirds the size of the United Kingdom.
Nearly one-tenth of the reserve area falls into core zones in which all
activities, including herding, are prohibited. The government spends
hundreds of millions of US dollars each year on moving nomads out
of those core areas, constructing steel meshes to stabilize the slopes
and planting artificially bred grass species to restore the eroded land.
Outside the core regions, officials have banned grazing on ‘severely
degraded’ grasslands, where vegetation typically covers less than 25%
of the ground. Land that is ‘moderately degraded’, where vegetation
coverage measures 25-50%, can be grazed for half of the year.
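
The cover thresholds described above can be sketched as a simple lookup. This is only an illustration, assuming that cover is given as a percentage and that values of exactly 25% and 50% fall in the ‘moderately degraded’ band; the function name `grazing_rule` and the returned labels are hypothetical, and the article gives no rule for land with more than 50% cover.

```python
def grazing_rule(cover_pct: float) -> str:
    """Map vegetation cover (%) to the grazing rule reported for land
    outside the reserve's core zones. Labels are illustrative only."""
    if not 0 <= cover_pct <= 100:
        raise ValueError("cover_pct must be between 0 and 100")
    if cover_pct < 25:
        # 'severely degraded': grazing banned
        return "no grazing"
    if cover_pct <= 50:
        # 'moderately degraded': grazing allowed for half of the year
        return "grazing for half of the year"
    # Above 50% cover: the article does not state a rule
    return "no rule stated"


if __name__ == "__main__":
    for cover in (10, 25, 40, 75):
        print(cover, "->", grazing_rule(cover))
```

The sketch uses a single set of cut-offs for every plot, with no adjustment for elevation or moisture, which is the property of the scheme that the rule as reported implies.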

Such policies — and related initiatives to limit livestock numbers 
and fence off areas of pasture — have not been easy on the herders, 
says Guo Hongbao, director of the livestock-husbandry bureau in 
Nagchu county in the southern Tibetan Plateau. “The nomads have 
made sacrifices for protecting the grasslands,” he says. But he also 
says that the strategies have paid off. Guo and other officials point to 
satellite studies showing that the plateau has grown greener in the past 
three decades. This increase in vegetation growth, possibly the result
of a combination of grazing restrictions and climate change, “has
had a surprisingly beneficial effect on climate by dampening surface
warming”, says Piao Shilong, a climate modeller at Peking University.

But ecologists say that such measurements look only at surface 
biomass and thus are not a good indicator of grassland health. “Not 
all vegetation species are equal,” says Wang. “And satellites can’t see
what’s going on underground.”

This is particularly important in the case of the sedge species that 
dominate much of the Tibetan Plateau, and that are the preferred food 
of livestock. These species, part of the Kobresia genus, grow only 
2 centimetres above the surface and have a dense, extensive root mat 
that contains 80% of the total biomass. 

Studies of pollen in lake sediments show that Kobresia and other 
dominant sedges emerged about 8,000 years ago, when early Tibetans 
began burning forests to convert them to grasslands for livestock.
The prehistoric grazing helped to create the thick root mat that 
blankets the vast plateau and that has stored 18.1 billion tonnes of 
organic carbon. 

But Kobresia plants are being driven out by other types of vegetation, 
and there is a risk that the locked-up carbon could be released and 
contribute to global warming. Every now and then on the trip to Lhasa, 
we passed fields blooming with the beautiful red and white flowers of 
Stellera chamaejasme, also known as wolf poison. “It’s one of a dozen
poisonous species that have increasingly plagued China’s grasslands,” 
says Zhao Baoyu, an ecologist at the Northwest Agriculture and 
Forestry University in Yangling. Zhao and his colleagues estimated that 
poisonous weeds have infested more than 160,000 square kilometres of 
the Tibetan grasslands, killing tens of thousands of animals a year.

Herders also report seeing new grass species and weeds emerge 
in recent years. Although most are not toxic, they are much less 
nutritious than Kobresia pastures, says Karma Phuntsho, a specialist 
on natural-resource management at ICIMOD. “Some parts of the 
plateau may seem lush to an untrained eye,” he says. “But it’s a kind of
‘green desertification’ that has little value.”

In one unpublished study of the northeastern Tibetan Plateau, 
researchers found that Kobresia pastures that had gone ungrazed for 
more than a decade had been taken over by toxic weeds and much 
taller, non-palatable grasses: the abundance of the sedge species had 
dropped from 40% to as low as 1%. “Kobresia simply doesn't stand
a chance when ungrazed,” says Elke Seeber, a PhD student at the
Senckenberg Natural History Museum in Görlitz, Germany, who
conducted the field experiment for a project supported by the German 
Research Foundation (DFG). 




The changes in vegetation composition have important 
implications for long-term carbon storage, says project member 
Georg Guggenberger, a soil scientist at Leibniz University of Hanover 
in Germany. In moderately grazed Kobresia pastures, up to 60% of 
the carbon that is fixed by photosynthesis went into the roots and soil 
instead of the above-ground vegetation — three times the amount 
seen in ungrazed plots. This underground organic carbon is much
more stable than surface biomass, which normally decomposes 
within a couple of years and releases its stored carbon into the 
air. So a shift from Kobresia sedge to taller grasses on the plateau
will ultimately release a carbon sink that has remained buried for 
thousands of years, says Guggenberger. 

Critics of the grazing restrictions in Tibet say that the government 
has applied them in a blanket way, without proper study and without 
taking on board scientific findings. In some cases, they make sense,
says Tsechoe Dorji, an ecologist at the ITPR’s Lhasa branch, who grew
up in a herder family in western Tibet. “A total grazing ban can be
justified in regions that are severely degraded”, he says, but he objects
to the simple system used by the government to classify the health 
of the grasslands. It only considers the percentage of land covered by 
vegetation and uses the same threshold for all areas, without adjusting 
for elevation or natural moisture levels. 

“Pastures with 20% vegetation cover, for instance, could be severely 
degraded at one place but totally normal at another,” says Dorji. 
This means that some of the grasslands that are classified as severely 
degraded are actually doing fine — and the grazing ban is actually 
hurting the ecosystem. “Having a sweeping grazing policy regardless 
of geographical variations is a recipe for disasters,” he says.


FAST FORWARD 

China's grazing policy is only one of several factors responsible 
for such damaging changes, say the researchers. Pollution, global 
warming and a rash of road-building and other infrastructure- 
construction projects have all taken a toll on the grasslands. 

Ten days after leaving Xining, we caught a glimpse of Tibet's future 
when we arrived at Nam Tso, a massive glacial lake in the southern 
part of the plateau. Here Dorji and Kelly Hopping, a graduate student 
at Colorado State University in Fort Collins, have been turning the 
clock forward by surrounding small patches of grassland with open- 
topped plastic chambers that artificially raise the temperature. These 
experiments are important because Tibet is a hotspot in terms of 
climate change; the average temperature on the plateau has soared by 
0.3-0.4°C per decade since 1960 — about twice the global average. 

In trials over the past six years, they found that Kobresia pygmaea, 
the dominant sedge species, develops fewer flowers and blooms much 
later under warming conditions. Such changes, says Dorji, “may
compromise its reproductive success and long-term competitiveness”. 

Near the headwaters of the Yellow River, lush grasslands have given way to sand dunes.

At the experimental site, the artificially warmed pastures have
been taken over by shrubs, lichens, toxic weeds and non-palatable
grass species, says Hopping. But when the researchers added snow
to some heated plots, Kobresia did not lose out to the other plants,
which suggests that the loss of soil moisture might be driving the shift
in species. Higher temperatures increase evaporation, which can be
especially potent at high elevations. “This is not good news for species
with shallow roots”, such as the Kobresia favoured by livestock, she says.
Piao says that “this interplay between temperature and precipitation 
illustrates the complexity of ecosystem responses to climate change”. 
But researchers have too little information at this point to build 
models that can reliably predict how global warming will affect the 
grasslands, he says. To fill that gap, Wang and his colleagues started a 
decade-long experiment in 2013 at Nagchu, where they are using heat 
lamps to warm patches of grassland by precise amounts, ranging from 
0.5°C to 4°C. They are also varying the amount of rainfall on the 
plots, and they are measuring a host of factors, such as plant growth, 
vegetation composition, nutrient cycling and soil carbon content. 
They hope to improve projections for how the grasslands will 
change — and also to determine whether there is a tipping point that 
would lead to an irreversible collapse of the ecosystem, says Piao. 


PLATEAU PROGNOSIS 

A fortnight into the trip, we finally arrived at the outskirts of Lhasa. 
At the end of the day, herders were rounding up their sheep and
yaks in the shadows cast by snow-capped peaks. They and the other
pastoralists across the plateau will have a difficult time in coming
decades, says Nyima. Climate change was not a consideration when
grassland policies were conceived over a decade ago, and so “many
pastoralists are ill prepared for a changing environment”, he says.
“There is a pressing need to take this into account and identify sound 
adaptation strategies.” 

As a start, researchers would like to conduct a comprehensive
survey of plant cover and vegetation composition at key locations
across different climate regimes. “The information would form the
baseline against which future changes can be measured,” says Wang.
Many scientists would also support changes to the grazing ban and
fencing policies that have harmed the grasslands. Dorji says that the
government should drop the simplistic practice of ‘one policy fits all’
across the plateau and re-evaluate whether individual regions are
degraded enough to merit a ban on grazing. “Unless the pastures
are severely degraded, moderate grazing will help to restore the
ecosystems,” he says.

But scientists are not banking on such reforms happening soon. 
Policies in Tibet are driven less by scientific evidence than by 
bureaucrats’ quest for power and funds, says a Lhasa-based researcher 
who requests anonymity for fear of political repercussions. Local 
officials often lobby Beijing for big investments and expensive projects 
in the name of weiwen (meaning ‘maintaining stability’). Because 
resistance to Chinese control over Tibet continues to flare up, the 
government is mostly concerned with maintaining political stability, 
and it does not require local officials to back up plans with scientific 
support, says the researcher. “As long as it’s for weiwen, anything goes.” 

But officials such as Guo say that their policies are intended to help 
Tibet. “Although there is certainly room for improvement in some of 
the policies, our primary goals are to promote economic development 
and protect the environment,” he says.

Far away from Lhasa, herders such as Dodra say that they are not 
seeing the benefits of government policies. After we finish our visit 
at his home, Dodra’s entire family walks us into the courtyard — his 
mother-in-law spinning a prayer wheel and his children trailing behind.
It has stopped snowing, and the sky has turned a crystal-clear, cobalt 
blue. “The land has served us well for generations,” says Dodra as he
looks uneasily over his pasture. “Now things are falling apart — but we
don't get a say about how best to safeguard our land and future.”


Jane Qiu is a freelance writer in Beijing. Her trip across the Tibetan 
Plateau was supported by the SciDev.Net Investigative Science 
Journalism Fellowship for the Global South. 
BOREDOM GETS INTERESTING


Implicated in everything from traumatic brain injury to learning ability,
boredom turns out to be anything but boring.


BY MAGGIE KOERTH-BAKER 


In 1990, when James Danckert was 18, his
older brother Paul crashed his car into a 
tree. He was pulled from the wreckage with 
multiple injuries, including head trauma. 
The recovery proved difficult. Paul had 
been a drummer, but even after a broken wrist 
had healed, drumming no longer made him 
happy. Over and over, Danckert remembers, 
Paul complained bitterly that he was just — 
bored. “There was no hint of apathy about it at 
all,” says Danckert. “It was deeply frustrating
and unsatisfying for him to be deeply bored 
by things he used to love.” 
A few years later, when Danckert was
training to become a clinical neuropsychologist,
he found himself working with about
20 young men who had also suffered traumatic
brain injury. Thinking of his brother, he asked
them whether they, too, got bored more easily
than they had before. “And every single one of
them,” he says, “said yes.”

Those experiences helped to launch 
Danckert on his current research path. Now 
a cognitive neuroscientist at the University
of Waterloo in Canada, he is one of a
small but growing number of investigators
engaged in a serious scientific study of boredom.

There is no universally accepted definition of boredom. But whatever 
it is, researchers argue, it is not simply another name for depression or 
apathy. It seems to be a specific mental state that people find unpleasant
— a lack of stimulation that leaves them craving relief, with a host
of behavioural, medical and social consequences. 

In studies of binge-eating, for example, boredom is one of the most 
frequent triggers, along with feelings of depression and anxiety. In
a study of distractibility using a driving simulator, people prone to
boredom typically drove at higher speeds than other participants, took
longer to respond to unexpected hazards and drifted more frequently
over the centre line. And in a 2003 survey, US teenagers who said that
they were often bored were 50% more likely than their less-frequently
bored peers to later take up smoking, drinking and illegal drugs.

Boredom even accounts for about 25% of variation in student achievement,
says Jennifer Vogel-Walcutt, a developmental psychologist at the
Cognitive Performance Group, a consulting firm in Orlando, Florida.
That's about the same percentage as is attributed to innate intelligence.
Boredom is “something that requires significant consideration”, she says.

Researchers hope to turn such hints into a deep understanding of 
what boredom is, how it manifests in the brain and how it relates to factors
such as self-control. But “it’s a ways out before we're answering those
questions”, says Shane Bench, a psychologist who studies boredom in
the lab of Heather Lench at Texas A&M University in College Station. In 
particular, investigators need better ways to measure boredom and more 
reliable techniques for making research subjects feel bored in the lab. 

Still, the field is growing. In May 2015, the University of Warsaw drew 
almost 50 participants to its second annual conference on boredom, 
which attracted international speakers from social psychology and sociology.
And in November, Danckert brought together about a dozen investigators
from Canada and the United States for a workshop on the subject.

Researchers in fields from genetics to philosophy, psychology and 
history are starting to work together on boredom research, says John 
Eastwood, a psychologist at York University in Toronto, Canada. “A critical
mass of people addressing similar issues creates more momentum.”


A MEASURE OF MALAISE 

The scientific study of boredom dates back to at least 1885, when the 
British polymath Francis Galton published a short note in Nature on 
‘The Measure of Fidget’ — his account of how restless audience mem- 
bers behaved during a scientific meeting. But decades passed with only 
a few people taking a serious interest in the subject. “There are things 
all around us that we don’t think to look at, maybe because they appear 
trivial,” says Eastwood. 

That began to change in 1986, when Norman Sundberg and Richard 
Farmer of the University of Oregon in Eugene published their Boredom 
Proneness Scale (BPS), the first systematic way for researchers to measure 
boredom — beyond asking study participants, “Do you feel bored?”. 
Instead, they could ask how much participants agreed or disagreed with 
statements such as: “Time always seems to be passing slowly”, “I feel that 
I am working below my abilities most of the time” and “I find it easy to 
entertain myself”. (The statements came from interviews and surveys 
that Sundberg and Farmer had conducted on how people felt when they 
were bored.) A participant’s aggregate score would give a measure of his 
or her propensity for boredom. 
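
The scoring idea described above, agreement ratings summed into one aggregate number, can be sketched in a few lines. This is only an illustration: the 7-point rating range and the choice to reverse-score the "easy to entertain myself" statement are assumptions made here, not details given in the article, and only the three quoted statements are included.

```python
from typing import Mapping

# Assumed rating range: 1 = strongly disagree, 7 = strongly agree.
SCALE_MIN, SCALE_MAX = 1, 7

# Statement -> True if agreement is assumed to signal LESS boredom
# (reverse-scored). This keying is a hypothetical choice for the sketch.
ITEMS = {
    "Time always seems to be passing slowly": False,
    "I feel that I am working below my abilities most of the time": False,
    "I find it easy to entertain myself": True,
}


def aggregate_score(responses: Mapping[str, int]) -> int:
    """Sum the ratings, flipping reverse-scored items, into one score."""
    total = 0
    for statement, reverse in ITEMS.items():
        rating = responses[statement]
        if not SCALE_MIN <= rating <= SCALE_MAX:
            raise ValueError(f"rating out of range for: {statement}")
        # A reverse-scored rating r becomes (min + max - r).
        total += (SCALE_MIN + SCALE_MAX - rating) if reverse else rating
    return total
```

A higher total would indicate a greater propensity for boredom under these assumptions; a real instrument would use its full item list and its published scoring key.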

The BPS opened up new avenues of research and made it apparent 
that boredom was about restlessness as much as apathy, the search for 
meaning as much as ennui. It has served as a launching point for other 
boredom scales, a catalyst for making the field more important and a 
tool for connecting boredom to other factors, including mental health 
and academic success. 


NATURE.COM 
To find your score on the Boredom Proneness Scale, see: go.nature.com/xpgwok 

But it also has some widely acknowledged flaws, says Eastwood. One is 
that the BPS is a self-reported measure, which means that it is inherently 
subjective. Another is that it measures susceptibility to boredom — ‘trait 
boredom’ — not the intensity of the feeling in any given situation, which is known as 
state boredom. Studies consistently show that these two measures are 
independent of each other, yet researchers are only beginning to tease 
them apart. 

This can be particularly confounding in educational settings. Shifts in 
teaching style or classroom environment are unlikely to reduce students’ 
trait boredom, which is intrinsic and slow to change, but can be very 
effective at reducing state boredom, which is purely situational. The BPS 
has often been misused to measure both forms of boredom at the same 
time, yielding answers that are likely to be misleading, says Eastwood. 

Scientists are still hashing out how to improve on the BPS. In 2013, 
Eastwood helped to develop the Multidimensional State Boredom Scale 
(MSBS), which features 29 statements about immediate feelings, such 
as: “I am stuck in a situation that I feel is irrelevant.” Unlike the BPS, 
which is all about the participant’s habits and personality, the MSBS 
attempts to measure how bored people feel in the moment. And that, 
Eastwood hopes, will give it a better shot at revealing what boredom is 
for everybody. 

But to measure boredom, researchers must first make sure that study 
participants are bored. And that is a whole different challenge. 


THE MOST BORING VIDEO EVER 

One way to create a particular mood, used for decades in psychology, is 
to show people a video clip. There are scientifically validated videos for 
inducing happiness, sadness, anger, empathy and many other emotions. 
So when she was working on her dissertation at Waterloo in 2014, Colleen 
Merrifield decided to make a video that would bore most people to tears. 

In Merrifield’s video, two men stand in a white, windowless room. 
Silently, they take clothes from a pile between them and hang them on 
a white rack — a camisole, a shirt, a sweater, a sock. The seconds tick 
by: 15, 20, 45, 60. The men keep hanging laundry. Eighty seconds. One 
of the men asks the other for a clothes peg. One hundred seconds. They 
keep hanging laundry. Two hundred seconds. They keep hanging laun- 
dry. Three hundred seconds. They keep hanging laundry. Shown on a 
loop, the video can last for as long as five and a half minutes. 

Perhaps unsurprisingly, the people to whom Merrifield showed this 
found it stupefyingly dull. But then she tried using the video to study 
how boredom affected the ability to focus and pay attention. Her pro- 
tocol called for participants to carry out a classic cognitive attention 
task — watching for star-like light clusters to appear or disappear on 
a monitor — then to sit through the video to get good and bored, and 
finally to do the task again so that she could see how boredom affected 
their performance. But she found that she had to redesign the experi- 
ment: the task was boring people more than the video. 

This was not entirely unexpected. Previous studies of boredom had 
often used tasks instead of videos. But it also demonstrated the problem. 
There are so many ways for researchers to bore people with tasks — 
asking them to proofread address labels, say, or to screw nuts and bolts 
together — that it had always been difficult to compare individual studies. 
For instance, different studies have found boredom to be correlated 
with both rising and falling heart rate. But without a standardized 
method for inducing boredom, it is impossible to work out who is right. 

In 2014, researchers at Carnegie Mellon University in Pittsburgh, 
Pennsylvania, published a paper that aimed to begin the process of 
standardization. It compared six different boredom inductions, rep- 
resenting three broad classes — repetitive physical tasks, simple 
cognitive tasks, and video or audio media — as well as a control video. 
The researchers used the MSBS to see how intensely each task elicited 
boredom, and a measure called the Differential Emotion Scale to see 
whether each task elicited boredom alone, or a number of other emo- 
tions. All six tasks were significantly more boring than the control and 
all six caused boredom almost exclusively. The best of the bunch was a 
task that required participants to click a mouse button to rotate a com- 
puter icon of a peg a quarter of a turn clockwise, over and over. 

After that, says Danckert, “I think I might be abandoning the video” 
to induce boredom in the lab. Instead, he will rely on behavioural tasks. 

The inexactness of the tools leaves holes in what researchers can 
reasonably say about boredom. For instance, many real-world prob- 
lems that are highly correlated with boredom are connected to the idea 
of self-control, including addiction, gambling and binge-eating. “I 
characterize boredom as a deficiency in self-regulation,” Danckert says. 
“It’s a difficulty of engaging with tasks in your environment. The more 
self-control you have, the less likely you are to be bored.” 

But does this mean that self-control and boredom are measures of the 
same thing? Even Danckert is uncertain. Consider people with a history of 
traumatic brain injury. “Failures of self-control are their problem,” he says. 
“They might be inappropriately impulsive; there's increased risk-taking; 
they might also engage in drug and alcohol abuse.” Danckert certainly 
saw his brother, Paul, experience all those things in the wake of his injury. 

But in Danckert’s research sample of people with traumatic brain 
injury — who are predominantly in their 40s — ageing seems to have 
weakened the link between boredom and self-control. In data that are not 
yet published, Danckert says, his patients report levels of self-control no 
lower than those of the general population, but their boredom-proneness 
scores are much higher. By contrast, Danckert’s brother seems to demon- 
strate the opposite effect. He struggled for years with self-control issues, 
but eventually became less bored and reclaimed his love of music. “It’s 
the most important thing in his life, next to his children,” Danckert says. 

So there is reason to suspect that boredom and self-control can exist 
independently — but there is not yet enough evidence to understand 
much beyond that. 


PAINFULLY DULL 

Despite all this uncertainty, researchers see themselves as laying a 
foundation, creating tools and standards that will allow them to tackle 
really important questions. “We're establishing boredom as a testable 
construct,” says Bench. 

Defining boredom is an important part of that. Different researchers 
have different pet definitions: a German-led team, for example, identifies 
five types of boredom. But most workers in the field agree that, at least 
some of the time, people will work very hard to relieve boredom. This 
not only presents a more active version of boredom than most people are 
probably used to, but also has tangible connections to efforts to address 
boredom in the real world. 

Lench and Bench are testing whether the drive to become un-bored is 
so strong that people might be willing to choose unpleasant experiences 
as an alternative. This idea builds on research that has shown a correla- 
tion between sensation-seeking behaviour, even risky behaviour, and 
high boredom-proneness scores. It is also similar to findings published 
in Science in 2014 and Appetite in 2015. In the first study, researchers 
asked people to sit in a room with nothing to do for as long as 15 minutes 
at a time. Some of the participants, particularly men, were willing to 
give themselves small electric shocks rather than be left alone with their 
thoughts. The second paper described two experiments: one in which 
the participants had access to unlimited sweets, and another in which 
they had access to unlimited electric shocks. Participants ate more when 
they were bored — but they also gave themselves more shocks. Even 
when it is not very pleasant, apparently, novelty is better than monotony. 

Novelty might also have a role in overcoming boredom in the 
classroom. In 2014, for instance, researchers led by psychologist 
Reinhard Pekrun of the University of Munich in Germany reported 
how they had followed 424 university students over the course of an 
academic year, measuring their boredom levels and documenting their 
test scores. The team found evidence of a cycle in which boredom begot 
lower exam results, which resulted in more disengagement from class 
and higher levels of boredom. Those effects were consistent throughout 
the school year, even after accounting for students’ gender, age, interest 
in the subject, intrinsic motivation and previous achievement. But other 
studies suggest that novelty can disrupt this cycle. 

Sae Schatz, director of the Advanced Distributed Learning Initiative, 
a virtual company that develops educational tools for the US Department 
of Defense, points to one experiment with a computer system 
that tutored students in physics. When the system was programmed to 
insult those who got questions wrong and snidely praise those who got 
them right, says Schatz, some students, especially adult learners, saw 
improved outcomes and were willing to spend longer on the machines. 
Schatz thinks that this could be because the insults provided enough 
novelty to keep people engaged and less prone to boredom. 

Looking to the future, researchers such as Eastwood are intent on 
finding better ways to understand what boredom is and why it is cor- 
related to so many other mental states. They also want to investigate 
boredom in people who aren't North American college students. That 
means testing older people, as well as individuals from diverse ethnic 
and national backgrounds. And, given the impact that boredom may 
have on education, it also means developing versions of the BPS and 
MSBS that can be administered to children. 

Many researchers likewise hope to expand on the types of study being 
done. To get beyond self-reported data, Danckert wants to start looking 
at brain structures, and seeing whether there are differences between 
people who score highly on the BPS and those who don't. These data 
could help him to understand why boredom manifests so strongly in 
some people with traumatic brain injury. 

There's also a need, Danckert says, for more scientists to realize that 
boredom is fascinating. “We may be on the cusp of having enough 
people to advance a little more quickly,” he says. 


Maggie Koerth-Baker is a freelance writer in Minneapolis, Minnesota. 
Drums containing contaminated materials from the US nuclear-defence programme are stored at the Waste Isolation Pilot Plant in New Mexico. 


Reassess New Mexico’s 
nuclear-waste repository 


Proposals to bury plutonium from nuclear weapons must address chemical interactions 
and intrusion risks, say Cameron L. Tracy, Megan K. Dustin and Rodney C. Ewing. 


More than 600 metres below ground 
near Carlsbad, New Mexico, is 
the world’s only operating deep 
geological repository currently accepting 
transuranic nuclear waste: that contaminated 
by elements heavier than uranium. The Waste 
Isolation Pilot Plant (WIPP), run by the US 
Department of Energy (DOE), is used to dis- 
pose of laboratory equipment, clothing and 
residues from the nation’s nuclear-defence 
programme. In the past 15 years, around 
91,000 cubic metres (equivalent to covering 
a soccer field to a depth of about 13 metres) 
of such transuranic waste, mostly of relatively 
low radiation levels, has been placed there. 

The main contaminants are long-lived iso- 
topes of plutonium (mainly plutonium-239, 
with a half-life of 24,100 years, and pluto- 
nium-240, with a half-life of 6,560 years) 
and shorter-lived isotopes of americium 
and curium. In rooms carved out of a 
250-million-year-old salt bed, the waste is 
stored in hundreds of thousands of plastic- 
lined steel drums. The repository is now 
at about half of its planned capacity and 
is to be sealed in 2033. 


The DOE is responsible for performing 
safety assessments to ensure that WIPP 
will not exceed limits on exposure to 
radioactivity, as set by the US Environmental 
Protection Agency (EPA), for 10,000 years. 

But new demands are emerging. An 
arms-control agreement with Russia made 
in 2000 obliges the United States to dispose 
of 34 tonnes of plutonium from dismantled 
nuclear weapons. Following the terms of 
the agreement, the United States planned 
to convert the material into a fuel — mixed 
(uranium and plutonium) oxide, or MOX 
— to burn in commercial nuclear-power 
plants. But faced with soaring construc- 
tion costs for a MOX fabrication facility at 
the Savannah River Site in South Carolina, 
the DOE has commissioned evaluations of 
alternatives. 

The most recent report, published in 
August 2015, recommends burying the weap- 
ons’ plutonium at WIPP. Judging the reposi- 
tory’s performance to have been “successfully 
demonstrated”, the DOE's Red Team expert 
panel proposes that the 34 tonnes of weapons 
plutonium can be added to WIPP once it has 
been diluted to low concentrations compara- 
ble to that of the transuranic waste at WIPP. 

In fact, WIPP’s safety record is mixed. On 
14 February 2014, a burst drum released 
small quantities of plutonium and ameri- 
cium to the surface (with a radioactivity of 
around 100 millicuries, or 3.7 gigabecquerels). 
Airborne radioactive material reached 
the surface through the ventilation system 
and spread 900 metres from the repository’s 
exhaust shaft. Twenty-one workers were 
exposed to low levels of radioactivity, the 
highest dose equivalent to that from a chest 
X-ray. Nine days earlier, smoke from a burn- 
ing truck filled the underground workings 
and shaft, damaging mechanical, electrical 
and ventilation systems. 

The DOE says that such accidents do not 
compromise the long-term performance of 
the repository. We agree that they need not 
— if lessons are learned. Our concern is not 
the events’ severity but that they were unan- 
ticipated. These accidents illustrate how dif- 
ficult it is to predict potential failures of such a 
disposal system over millennia. For example, 
assumptions about the repository’s geochem- 
istry or the likelihood of drilling into it can 
lead to underestimates of the risks. 

Before expanding WIPP’s plutonium 
inventory, the DOE must examine more carefully 
its safety assessment for performance that 
stretches to 10,000 years and beyond. 


CULTURE OF COMPLACENCY 

The 2014 radioactive leak at WIPP was 
caused by heat from a chemical reaction in a 
drum. Plutonium-contaminated nitrate salts, 
a waste product of plutonium purification at 
Los Alamos National Laboratory (LANL) in 
New Mexico, reacted with an organic, wheat- 
based commercial cat litter used as an absor- 
bent for liquid wastes. The heat popped the 
lid. Although sensors detected the released 
radioactivity and diverted exhaust air through 
filters, some radioactive material leaked 
through. WIPP operators sealed the leak in 
the filtration system and sealed off the room 
in which the leak occurred. The breached 
drum remains in the repository. 

Analyses of the accidents by the DOE 
have documented a lack of a ‘safety culture’ at 
WIPP. The facility’s successful operation for 
15 years had bred complacency. The failures 
were wide-ranging: in safety assessments, 
control of drum contents, installation and 
maintenance of equipment, and preparation 
for an accident. An investigation of the drum- 
packaging procedure, for example, found “no 
evidence that any type of technical evaluation 
occurred” when selecting the organic absor- 
bent material, even though its incompatibil- 
ity with nitrate salts had been raised at LANL 
during waste packaging. 

From a systems-analysis perspective, the 
drum breach was a ‘normal’ accident — a 
human mistake that led to a cascade of errors 
and breakdowns, exacerbated by a failure to 
enforce safety protocols. Complex technologies are 
prone to unanticipated failures that can progress 
quickly; examples include the 1979 Three Mile Island 
nuclear-plant meltdown in Pennsylvania and 
the 1986 Challenger space-shuttle explosion. 
Such accidents cannot be easily predicted, 
but a system designed with failure in mind 
can mitigate the risk. 

The WIPP accident can be taken as a posi- 
tive — it presents an opportunity to learn. The 
DOE has aggressively identified its causes and 
implemented corrective actions; incompatible 
chemicals are no longer mixed in the drums. 
But once the repository is closed, its contents 
cannot be monitored or problems fixed. We 
cannot be certain that future inhabitants of 
the area will even know that WIPP is there. To 
put the timescales in perspective, agriculture 
was developed just over 10,000 years ago. 




LONG-TERM SAFETY 
WIPP’s present safety assessment addresses 
two scenarios: first, undisturbed performance 
and, second, human intrusion, such as inad- 
vertently drilling through the repository in 
search of oil and gas. The first foresees that 
after closure, the salt into which the reposi- 
tory is built will deform and flow around 
the drums to encase the waste. The model 
assumes that no fluids, such as brine, are 
present and that the site remains geologically 
isolated. Although the drums will be crushed, 
the radioactive material will be locked in the 
dry, solid salt, with no way to release radioac- 
tivity to the biosphere. Reliance on the geo- 
logical barrier is so great that the form and 
composition of the waste is assumed to be 
unimportant; it need not even be treated. 

Human intrusion could release radioactivity 
to the environment. Salt deposits, 
layered as sediments or as salt domes, are 
often associated with mineral and energy 
resources, such as potash and hydrocarbons 
— oil and gas. In southeastern New Mexico, 
exploration for and extraction of these fuels 
has led to extensive drilling in the Permian 
Basin, where WIPP is located. 

The probability of a borehole piercing 
the repository in the next 10,000 years is 
significant. If a borehole were to puncture 
both the repository and one of the brine pockets that 
are known to exist in the Castile geological 
formation below the Salado salt formation 
in which the repository sits, fluid may reach 
the transuranic waste (see ‘Accident risk’). 
To assess the risk of radioactive release, one 
must first establish the probability of bore- 
hole penetration and determine how the 
pressurized brine will react with the waste. 

In forecasting future drilling rates, the EPA 
has used a 100-year historical average rate for 
the region, which predicts 67.3 boreholes per 
square kilometre over the 10,000-year regulated 
period. But drilling near WIPP has 
risen sharply in recent years. As horizontal 
drilling and hydraulic-fracturing techniques 
have made new areas of hydrocarbon-bear- 
ing rocks accessible, the Permian Basin has 
become the most prolific oil-producing area 
in the United States. A recent 10-year histori- 
cal average (2002-12) yields 148 boreholes 
per square kilometre over 10,000 years, more 
than doubling the projected risk of repository 
intrusion. Drilling rates, the effects of new 
technologies, and supply and demand pres- 
sures on hydrocarbon production are difficult 
if not impossible to predict centuries ahead. 
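
To make the comparison of the two drilling projections concrete, the short sketch below computes their ratio and, purely for illustration, the chance that at least one borehole lands on a given patch of ground. The Poisson model and the footprint area are assumptions introduced here; the article gives only the two borehole densities, and nothing below describes how the EPA or DOE actually model intrusion.

```python
import math

REGULATORY_PERIOD_YEARS = 10_000

# Borehole densities quoted in the text (per square kilometre over 10,000 years).
EPA_100_YEAR_AVERAGE = 67.3
RECENT_10_YEAR_AVERAGE = 148.0


def annual_rate(boreholes_per_km2: float,
                period_years: float = REGULATORY_PERIOD_YEARS) -> float:
    """Average boreholes per square kilometre per year."""
    return boreholes_per_km2 / period_years


def prob_at_least_one(boreholes_per_km2: float, footprint_km2: float) -> float:
    """Chance of one or more boreholes hitting the footprint.

    Assumes boreholes are placed independently and uniformly (a Poisson
    model), which is a simplification chosen for this sketch.
    """
    expected_hits = boreholes_per_km2 * footprint_km2
    return 1.0 - math.exp(-expected_hits)


ratio = RECENT_10_YEAR_AVERAGE / EPA_100_YEAR_AVERAGE  # roughly 2.2

# Hypothetical footprint of 0.01 square kilometres, not a figure from the text.
HYPOTHETICAL_FOOTPRINT_KM2 = 0.01
p_old = prob_at_least_one(EPA_100_YEAR_AVERAGE, HYPOTHETICAL_FOOTPRINT_KM2)
p_new = prob_at_least_one(RECENT_10_YEAR_AVERAGE, HYPOTHETICAL_FOOTPRINT_KM2)
```

Under these assumptions the expected number of boreholes scales linearly with both the density and the footprint, so either a higher drilling rate or a larger repository raises the chance of intrusion.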

The concentration of transuranic elements 
leached by intruding brine is also hard to esti- 
mate because of the complexity of the waste: a 
typical drum contains a variety of materials, 
such as lab coats, gloves and other laboratory 
equipment. Different micro-geochemical 
environments will develop around different 
waste types. Chemically organic materials, 
such as plastic bags, may degrade by micro- 
bial action and generate carbon dioxide. 
In brine, CO₂ forms stable carbonate and 
bicarbonate complexes with plutonium and 
other actinides (elements 89-103), raising 
their concentrations in solution. Large bags 
of magnesium oxide powder, amounting to 
more than 31,000 tonnes, have been placed 
in WIPP disposal rooms as an ‘engineered 
barrier’. The magnesium oxide should react 
with the CO₂ to form stable magnesium carbonates, 
thereby removing CO₂ from solution 
and reducing the solubility of actinides. This 
presumes that the reactions proceed to completion 
and all the CO₂ is consumed. 

The safety analysis calculations for WIPP 
assume that there is no CO₂ present, dramatically 
lowering actinide concentrations 
in the brine and thus the risk of release of 
radioactivity. But reliance on magnesium 
oxide and a series of idealized reactions to 
constrain the repository’s geochemistry is 
problematic, particularly if the amount of 
plutonium stored at WIPP increases. As 
made clear by the 2014 accidents, complex 
interactions of materials must be carefully 
considered when predicting the repository’s 
performance now and in the future. 

ACCIDENT RISK 
Thousands of years in the future, inadvertently drilling a borehole through the Waste Isolation Pilot 
Plant, a nuclear-waste repository, and into a brine pocket could release radioactive material into the 
environment. The brine would interact with the waste and contaminated fluid could reach the surface 
through the borehole or shaft and spread within permeable rocks. 
[Figure: cross-section of the repository site. Labels: Rustler formation and overlying units (mixed shale, sandstone and salts); layers of potash, clay and anhydrite; Castile formation (anhydrite and brine). Step 1: Borehole for exploratory drilling inadvertently penetrates repository and punctures brine pocket.] 

The Red Team report proposes diluting 
the weapons plutonium before its disposal in 
an “inert adulterant” — a classified mixture 
of cementing, gelling, thickening and foam- 
ing agents known as stardust. The report is 
unclear on what is meant by ‘inert’; however, 
inert materials are rare, particularly those 
that must remain so for thousands of years. 


PLUTONIUM DISPOSAL 

In the case of plutonium-bearing solids, 
demonstrating chemical inertness presents 
a huge challenge. In near-surface conditions, 
plutonium can assume a variety of oxidation 
states — up to four, each with different 
solid-state and geochemical behaviours. Its 
decay product uranium-235 has two principal 
oxidation states, U⁴⁺ and U⁶⁺, each with 
different geochemical mobility. This complexity 
makes it difficult to predict how the 
actinides will react or be transported. 

Also, actinides decay mainly by the emission 
of α particles (energetic helium nuclei). 
During each decay, the daughter nucleus 
recoils and displaces thousands of atoms in 
the surrounding solid. Over time, this dam- 
age accumulates and changes the properties 
and chemical stability of the material. Radiation 
effects in actinide-bearing materials 
have been well documented over the past 
20 years, but are not considered in the Red 
Team's evaluation. 

[Figure labels, ‘Accident risk’: dolomite aquifer; Salado formation (beds of rock salt); waste disposal region; repository. Step 2: Brine reacts with radioactive waste and mobilizes plutonium. Step 3: Radioactive fluid spreads through the borehole, shaft or rock layers. Repository and rig not to scale.] 

The ‘dilute-and-dispose’ proposal to 
convert weapons-plutonium pits to plutonium 
oxide for burial in WIPP immediately 
raises safety issues. The extra 34 tonnes is 
nearly three times the plutonium currently projected 
(around 12 tonnes) at closure, which would bring 
the total to about 46 tonnes. The design 
and safety assessment did not envision such 
a large amount. WIPP’s capacity would have 
to expand by 15%, increasing the likelihood 
that a borehole will one day intersect it. 

And the changed inventory of actinides 
demands new assessments of interactions 
with the materials present, including brine 
and CO₂. The amount of plutonium mobilized 
in brine depends on its solubility, 
which depends on its form and the amount 
of CO₂ present after reaction with the bags of 
magnesium oxide. 


NEXT STEPS 
The current regulatory period of 10,000 years 
is short relative to the 24,100-year half-life of 
plutonium-239, let alone that of uranium-235, 
which has a half-life of 700 million years. To 
accommodate the extra plutonium, the regu- 
latory period might be lengthened, meaning 
that the probability of human intrusion dur- 
ing this period increases. 
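
The mismatch between the regulatory period and the half-lives can be checked with the standard exponential-decay relation, N(t)/N(0) = 0.5^(t / half-life). The sketch below applies it to the two plutonium half-lives quoted earlier in this article; the function name is illustrative, and the calculation ignores ingrowth of decay products.

```python
def fraction_remaining(elapsed_years: float, half_life_years: float) -> float:
    """Fraction of the original nuclide left after elapsed_years."""
    return 0.5 ** (elapsed_years / half_life_years)


REGULATORY_PERIOD_YEARS = 10_000

# Half-lives as quoted in the text.
PU239_HALF_LIFE_YEARS = 24_100
PU240_HALF_LIFE_YEARS = 6_560

pu239_left = fraction_remaining(REGULATORY_PERIOD_YEARS, PU239_HALF_LIFE_YEARS)
pu240_left = fraction_remaining(REGULATORY_PERIOD_YEARS, PU240_HALF_LIFE_YEARS)

print(f"Pu-239 remaining after 10,000 years: {pu239_left:.1%}")
print(f"Pu-240 remaining after 10,000 years: {pu240_left:.1%}")
```

On these figures roughly three-quarters of the plutonium-239 would still be present when the 10,000-year regulatory period ends.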

Some of these issues and others were raised 
in two 2015 reviews of the Red Team report 
by the consultancy High Bridge Associates of 
Greensboro, Georgia. But the analysis did not 
consider the possibility of human intrusion. 

WIPP is fulfilling an important national 
need — the disposal of legacy transuranic 
waste from US defence programmes. Its 
opening was the culmination of 20 years of 
scientific research, engineering design and 
public engagement. Despite the accidents, 
WIPP can still fulfil its mission. 

However, proposals to substantially 
increase the plutonium inventory combined 
with a failure to revise the safety assessment, 
particularly the possibility of human intru- 
sion, bear witness to the ease with which pol- 
icy decisions can disregard the fundamental 
science — and risk yet another failure. 

The Red Team report shows a limited 
effort to consider or manage inherent risks. 
The shortcomings of proposals to dispose 
of weapons plutonium at WIPP mirror 
the operational failings that led to the 2014 
accidents. Before the DOE considers imple- 
menting these recommendations, it should 
look to the repository’s record over the past 
15 years of operation and reassess its confi- 
dence in the safe performance of the facility 
over the next 10,000. 
Harrison Dyar collected 500,000 different kinds of mosquitoes over his career. 


ENTOMOLOGY 


A life of insects 
and ire 


Professional feuds and private oddities abound in a 
biography of Harrison Dyar, finds William Foster. 


Harrison G. Dyar Jr (1866-1929) 
was an influential US biologist 
who became notorious in his own 
lifetime. His pioneering work changed our 
understanding of the biology and system- 
atics of two globally important groups: the 
Lepidoptera (butterflies and moths) and 
mosquitoes. But his scientific legacy is overshadowed 
by his protracted, spectacularly 
belligerent feuds with fellow entomologists, 
and scandalous revelations about his private 
life. For some 14 years, he was married to 
two women, maintaining two families of five 
children in all. Towards the end of his life, 
he built an extensive system of brick-lined 
tunnels deep under the heart of Washing- 
ton DC. When accidentally discovered in 
1924, the tunnels caused such a sensation 
that they featured in cartoons in that year’s presidential campaign. 

Moths, Myths, and Mosquitoes: The Eccentric Life of Harrison Dyar 
MARC EPSTEIN 
Oxford University Press: 2016. 

Insect taxonomist 
Marc Epstein has a 
long association with 
the entomology department of the Smithso- 
nian Institution in Washington DC, where 
Dyar spent most of his professional career. To 
write Moths, Myths, and Mosquitoes, Epstein 
dug deep into the Smithsonian's holdings 
of publications, correspondence, diaries, 
unpublished short stories and novellas, 
newspaper articles and marriage certificates, 
providing rich context for Dyar’s intellectual 
and scientific milieu. Epstein’s descriptions 
of Dyar’s collecting trips and battles with col- 
leagues are particularly evocative. 

Dyar’s major scientific achievement was to 
establish the use of insects’ immature stages 
in the construction of evolutionary trees, or 
phylogenies, based on Darwinian principles. 
His 1902 list of North American Lepidoptera 
has stood the test of time, as have most of 
his species classifications among the Lepi- 
doptera, sawflies and mosquitoes. Among 
biologists he is perhaps best known for for- 
mulating Dyar’s Law, which postulates that 
the width of the head capsule of each devel- 
opmental stage, or instar, of a caterpillar is 
related by a constant ratio to the width of the 
head capsule of the next instar. When looking 
at a sample, this takes most of the guesswork 
out of deciding which instar you are deal- 
ing with, whether any are missing and how 
many a particular species might have. Epstein 
shows that Dyar fudged his data a little to fit 
his model, but the model itself remains use- 
ful. (It certainly worked for me when I did 
my first biology project, a demonstration of 
Dyar’s Law, as a schoolboy in the mid-1960s.) 
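
Because the law describes a constant ratio between successive head-capsule widths, it amounts to a geometric progression, which is easy to sketch. The starting width and ratio used in the usage note are made-up illustrative values, not measurements from the book or from Dyar's data, and the function names are hypothetical.

```python
import math


def predicted_widths(first_width_mm: float, ratio: float,
                     n_instars: int) -> list[float]:
    """Head-capsule widths for instars 1..n under a constant growth ratio."""
    return [first_width_mm * ratio ** i for i in range(n_instars)]


def nearest_instar(width_mm: float, first_width_mm: float, ratio: float) -> int:
    """1-based instar whose predicted width is closest on a log scale."""
    steps = math.log(width_mm / first_width_mm) / math.log(ratio)
    return max(1, round(steps) + 1)
```

With a hypothetical first-instar width of 0.5 mm and a ratio of 1.5, `predicted_widths(0.5, 1.5, 4)` gives 0.5, 0.75, 1.125 and 1.6875 mm, and a measured capsule of 1.1 mm would be assigned to the third instar. A gap in the observed sequence would point to a missing instar.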

It is relatively difficult to establish objec- 
tive measures of accuracy in systematics and 
phylogeny; as a result, researchers are often 
tempted to settle arguments by sheer force of 
personality. Dyar embraced this temptation 
with rash enthusiasm, even though his own 
research had enhanced the objectivity of 
the field. He seemed to relish long-running 
scientific battles. In 1905, he accused Henry 
Skinner, editor of Entomological News, of 
indulging in a “hysterical outburst”, adding, 
“Better take a sedative.” Skinner responded 
that Dyar drank “too much ice water”, so 
his blood was “too frigid” to understand 
what went on in the News. In 1908, mos- 
quito expert Evelyn Mitchell sued Dyar for 
libel. The case was dismissed, and Dyar 
got his revenge by writing an unpublished 
short story, ‘The Taming of a Suffragette’, which features 
a thinly disguised Mitchell. As Moths, Myths, and Mosquitoes 
progresses, one begins to sense that the battles 
are what kept Dyar going. “Lest affairs 
become too monotonous, I feel obliged to 
start something occasionally,” he wrote in 
1925, ina letter to Leland Howard, his supe- 
rior at what is now the US National Museum 
of Natural History. 

Much of the book, in particular the sec- 
tions on Dyar’s private life, feels like the raw 
materials of a biography rather than a fin- 
ished product. Epstein fields a wonderful set 
of characters, but struggles to breathe life into 
them; many passages are cluttered with detail. 
Nor is it always easy to work out the sequence 
of events, but the many excellent illustrations 
and photographs help to shape and colour 
the narrative. There is a particularly moving 
photo of Dyar and his two-year-old son Otis 
on a beach, staring rather gloomily into the
camera. Next to them is a huge sand sculpture 
of a spiny limacodid caterpillar: even at the 
seaside, there is no escape from Dad’s work. 

In the epilogue, Epstein offers definitive 
proof of when Dyar’s bigamy began. Dyar 
married Zella Peabody in 1889, fathering 
two children with her before their divorce in 
1920. Yet in 1906, Dyar had secretly married 
Wellesca Pollock under the name of Wilfred 
P. Allen. He had three children with Pollock;
surreally, she later tried, and failed,
to obtain a divorce from the imaginary
Allen. In 1917, Dyar was exposed for the
deception and dismissed from government
service.

However appalling Dyar’s behaviour 
to relatives and colleagues (he described 
Smithsonian librarian Mathilde Carpenter 
as a “boisterous, screaming plebian” in a let- 
ter to her supervisor), there is something 
oddly admirable about his final obsession. 
While digging a hollyhock bed for Peabody, 
he found himself 2 metres down, and was 
seized with the urge to keep going. He con- 
structed a labyrinth of tunnels lined with 
bricks and furnished with electric lights, 
sculpted animal heads and mottos from 
Virgil. He claimed that they provided him 
with exercise. And no biologist could fail to 
admire Dyar’s unquenchable respect for the 
insects that he studied. When the US gov- 
ernment set out to exterminate mosquitoes 
in Yosemite National Park, California, Dyar 
protested, saying that a few bites are “good 
for hikers and lend zest to the fisherman's 
waiting”.


Books in brief


The Restless Clock: A History of the Centuries-Long Argument 
Over What Makes Living Things Tick 

Jessica Riskin UNIVERSITY OF CHICAGO PRESS (2016) 

At the heart of this scientific and cultural history is the concept of 
agency — the capacity to act — in nature. Jessica Riskin reveals how 
two distinct interpretations emerged from the mechanical Universe 
of the Enlightenment: Isaac Newton’s passive version, reliant on a 
divine tinkerer; and Gottfried Leibniz’s, which saw life as purposeful 
and “self-transforming”. Riskin’s investigation of this duality, by way 
of Renaissance automatons, the gestation of evolutionary theory 
and quantum mechanics, is engrossing and illuminating. 


Herding Hemingway’s Cats: Understanding How Our Genes Work 
Kat Arney BLOOMSBURY SIGMA (2016) 

In this witty, clued-up report from the front lines of genetics, science 
communicator and broadcaster Kat Arney unravels the intricacies of 
the discipline with a romp through ‘thumbed’ cats, hipped fish and 
frank interviews with scientists such as evolutionary biologist Dan 
Graur. As she synthesizes key findings, she deploys a host of droll, 
yet apt, metaphors (including the human genome as a grim cable- 
television channel featuring tedious repeats), and pulls no punches 
in laying out the vast gaps in our understanding and the rancorous 
debates within the field. 


City of Thorns: Nine Lives in the World’s Largest Refugee Camp
Ben Rawlence PICADOR (2016) 

Dadaab in the Kenyan desert is the world’s largest refugee camp, a 
last-ditch home to some half a million people fleeing violence in the 
Horn of Africa. In this trenchant, densely layered sociopolitical study, 
investigative journalist Ben Rawlence reveals Dadaab’s complexities 
through the lives of nine residents, impossibly courageous survivors 
of derailed cultures, imploded cities and sundered families. A 
reminder that although there are thousands of refugees at Europe’s 
borders, millions more languish in camps — as Rawlence puts it, 
between “impossible dreams and a nightmarish reality”. 


The Lucky Years: How to Thrive in the Brave New World of Health 
David B. Agus SIMON & SCHUSTER (2016) 

Oncologist and biomedical researcher David Agus’s bestselling
The End of Illness (Simon & Schuster, 2012; see Nature 480,
177; 2011) argued persuasively for personalized health care. In
this clear-cut follow-up, he details health interventions including 
monitoring technologies, analysable aggregate data sets and, more 
controversially, smartphone apps that detect signs of depression. 
What is strongest here is Agus’s deft marshalling of research old and 
new, and his common-sense guidance on preventives such as sleep 
hygiene and the optimal level of exercise (450 minutes per week). 


The Sacred Combe: A Search for Humanity’s Heartland

Simon Barnes BLOOMSBURY NATURAL HISTORY (2016) 

The moment a herd of elephants ripped into his thatched hut did it
for natural-history writer Simon Barnes: he suddenly realized that 
Zambia’s Luangwa Valley had claimed him for its own. This episodic 
journey into the wilds of Devonshire, Africa and memory — the edenic 
spaces where species fleetingly coexist — is studded with descriptive 
jewels. Here, for instance, are eland antelopes, “one-tonners drifting 
back like pale wisps of smoke”, and an otter with “elegant bum briefly 
sky-pointing” as it dives in for the hunt. Barbara Kiser 


GERONTOLOGY


Extending the healthspan 


Linda Partridge examines studies on preventative medicine for the ageing. 


The chances of living to old age are
higher than ever in many parts
of the world. So, particularly in
developed countries, health-care systems
are struggling to cope with the ‘silver
tsunami’ of elderly people with clusters
of diseases for which age is the main 
risk factor, including cancer, diabetes, 
cardiovascular disease, sarcopenia and 
dementia. Fortunately, the opportunity 
is at hand to transform the landscape and 
keep people in better health as they age. 

Aging is a collection of articles edited 
by gerontologists Jay Olshansky, George 
Martin and James Kirkland. It explores the 
potential to extend human health by draw- 
ing on discoveries about the biology of age- 
ing. The overall coverage is US-centric, 
and would have benefited from more 
cutting-edge basic science and human 
demography from the rest of the world. 
But the message is clear: it is time to begin 
the revolution in medical approaches to 
ageing-related disease and late-life health. 

In lab animals, Aging tells us, simple 
genetic and environmental interventions 
can increase healthy lifespan substantially. 
A restricted diet can protect ageing rodents 
and rhesus monkeys from most impair- 
ments and diseases. Genetic alterations 
to the signalling networks that sense and 
respond to nutrients and to other inputs 
can have similar effects. These interven- 
tions tamp down the changes that char- 
acterize ageing and that lead to pathology, 
including chronic inflammation, cellular 
senescence, damage to macromolecules and 
decline in stem-cell function. 

An emerging consensus thus regards 
ageing as composed of modifiable sub- 
syndromes. These could be targeted with 
drug combinations and environmental 
changes to produce a broad-spectrum, pre- 
ventive medicine for multiple ageing-related 
diseases. Currently, most treatment is directed 
towards individual diseases as they arise. 

Development of new drugs to target 
ageing is a major challenge, thoughtfully 
addressed in the chapter by James Kirkland. 
Ageing is not recognized by regulatory 
authorities as a disease, or as a valid target of 
clinical trials, which would be prohibitively 
expensive because they would need to be of 
long duration, and to use initially disease- 
free participants. It would be more feasible 
to test drugs that have been approved for 
specific diseases, and that also target mechanisms
of ageing, against other age-related
conditions. For instance, older people tend
to have a poor immune response to immu- 
nization against influenza; that response 
can be improved by pre-treating them 
with sirolimus, a drug already shown to 
increase animal lifespan and licensed as an immunosuppressant
for use after kidney transplants. Polymorbidity (the presence of
multiple conditions at once) has not yet been assessed as a potential
drug target, and would require new, combinatorial measures of outcome
and trials with older people, who are currently generally [...]
different axes of ageing may give maximum
protection, posing further interesting
challenges for trials.

Olshansky discusses the ethics of tar- 
geting ageing for disease prevention. 
Ageing, he shows, is not nature’s way of 
making space for the young; rather, it is a 
haphazard process of decline. The discov- 
ery that it is malleable allows an entirely 
new approach to improving the health 
and welfare of older people. Lifespan 
might increase slightly as a result, but the 
crucial point is that morbidity could be 
compressed. Olshansky also shows how 
the conventional route of treating single 
diseases could, perversely, increase overall 
morbidity, because the longer people live, 
the greater the part played by ageing in 
health status. Interfering in mechanisms 
of ageing provides the current best pros- 
pect for preventing cancer, cardiovascular 
disease and dementia. 

From a health-economics perspective, 
examined by Dana Goldman, the results of 
reducing morbidity in older people would 
depend on public policy. If older people 
were healthier and more active, then they 
would depend less on others and would 
produce greater economic activity — from 
work and volunteering — both of which 
would be a net benefit to society. A direct 
comparison of the predicted economic 
consequences of the status quo, delayed 
cancer, delayed heart disease and delayed 
ageing showed that this last scenario would 
result in a much higher proportion of able 
older people. However, for that to create an 
economic benefit, age-related entitlements 
to various forms of support, particularly pen- 
sions, would have to change. 

Basic science and human demographic 
studies have delivered an unprecedented 
opportunity to tackle the comorbidities of 
later life. To translate these discoveries into 
drugs and changes in medical practice, and 
to reap the consequent economic benefits, 
will require some radical changes: breaking 
down disease siloes, training a new genera- 
tion of physicians and scientists capable of 
working across disciplinary boundaries, and 
altering public attitudes and policy.


Refugee fences
fragment wildlife 


Erecting border fences in parts of 
Europe in response to the current 
massive influx of refugees may 
harm wildlife. The fences can kill 
animals by entangling them in 
razor wire and will jeopardize the 
hard-won connectivity of species 
populations. 

The human toll of the refugee 
crisis deserves the highest 
political attention. At the same 
time, many of the fences could 
be in violation of commitments 
under international conservation 
agreements, such as the European 
Commission's Habitats Directive. 

With the opening of political 
borders during the twentieth 
century, Europe's large fauna 
have rebounded. This success 
is a result of trans-boundary 
conservation projects backed 
by legislation and effective 
management. 

However, refugee fences have 
proliferated along the borders of 
Slovenia, Croatia and Hungary, 
for example, and more are 
planned along the boundaries of 
Latvia and Estonia with Russia. 
These are likely to affect brown 
bear, wolf, lynx and red deer 
species. 

Mitigation measures should 
include adapting national 
conservation-management 
schemes to ensure the survival 
of newly isolated animal 
populations; designing the 
structure and placement of 
fences to minimize their impact 
on wildlife; and removing the 
fences at the earliest opportunity. 
John D. C. Linnell* Norwegian 
Institute for Nature Research, 
Trondheim, Norway. 
john.linnell@nina.no 
*On behalf of 4 correspondents (see 
go.nature.com/fm6aaa for full list). 


Treat wasting illness 
on multiple fronts 


Cachexia is a complex wasting 
syndrome that cannot be fully 
reversed by nutritional support 
alone (see, for example, Nature
528, 182-183; 2015). There is
accumulating evidence that a 
comprehensive multimodal 
approach may succeed where 
unimodal treatments (such as 
nutrition or anabolic drugs) 
have failed to deliver extended 
clinical benefits. Support for
a multimodal policy comes
from established rehabilitation
programmes (see, for example,
M. A. Spruit et al. Am. J.
Respir. Crit. Care Med. 188,
e13-e64; 2013).

The failure of the classical 
unimodal approach suggests 
that a shift in clinical-trial design 
is needed (K. C. H. Fearon
et al. J. Cachexia Sarcopenia 
Muscle 6, 272-274; 2015). This 
could include monitoring the 
combined effects of exercise 
and nutrition, along with 
controlling metabolism and 
systemic inflammation. These 
interventions would need to 
be tested early, before cachexia 
becomes irreversible. 

The complexity of such 
interventions makes them 
difficult to organize and fund. 
They would require input 
from research, government, 
pharmaceutical companies and 
regulatory authorities. However, 
the possible clinical benefits 
stand to improve the quality 
and, in the long term, perhaps 
even the quantity of patients’ 
lives. 

Kenneth Fearon University of 
Edinburgh, UK. 
k.fearon@ed.ac.uk 


EU conservation
overlooks geology 


The European Commission 
needs to expand its conservation 
policy to protect its seriously 
threatened geological heritage. 
Legislation for nature 
conservation in the European 
Union has so far focused mainly 
on biodiversity and habitats 
(see, for example, V. Hermoso 
Nature 528, 193; 2015). But 
fossils, rocks, minerals and 
landforms also contribute to a
country’s geological landscape 
and heritage. Their features 
are a scientific asset that is 
shared by all countries, as well 
as an educational and cultural 
resource. They are also essential 
for supporting services to 
biodiversity. For example, 
geological sites in coastal 
cliffs and rocky outposts 
harbour and protect huge 
varieties of sea birds. 

Neither of the two EU 
conservation directives that 
are currently under review 
(see go.nature.com/vkm9r7) 
includes the non-living elements 
of natural heritage, making 
it hard to encourage public 
respect for important geological 
features. 
José Brilha European Association 
for the Conservation of Geological 
Heritage (ProGEO); and 
University of Minho, Braga, 
Portugal. 
jbrilha@dct.uminho.pt 


Plans for European 
medical doctorate 


I agree with Stefan Hardt and 
colleagues on the benefits of 
a unified European medical 
doctorate (Nature 528, 333; 
2015). However, removing the 
research component to create a 
vocational degree could result in 
a shortage of clinician scientists. 

This is evident from our 
(unpublished) 2014 survey of 
1,069 supervisors of dissertations 
at Charité in Berlin, one of 
Europe’s largest university 
hospitals. Just under 1% of 
3,714 research projects were 
of an MD-with-PhD type, 
more than two-thirds were 
MD projects and the rest were 
straight PhDs and dental or 
nursing projects. Thus, shifting 
to a purely vocational European 
medical doctorate system would 
mean many physicians missing 
out on useful research training 
(see also D. M. Milewicz et al. 
J. Clin. Invest. 125, 3742-3747; 
2015). 

One solution might be to
integrate the European medical
doctorate into the medical
curriculum. This would also
encourage more-consistent
application of evidence-based
medicine in daily practice
throughout the European Union
(see J. Hilgers et al. Med. Teach.
29, 270-275; 2007). Formal
training of lecturers, tutors
and supervisors responsible
for this integration would help
to standardize and improve
the quality of dissertation
supervision (see Nature 527, 7;
2015).

Marc Dewey Charité — University 
Medicine Berlin, Germany. 
dewey@charite.de 


Monitor safety of 
aged fuel pipelines 


Ensuring the integrity and safety 
of old pipelines that transport oil 
and natural gas calls for frequent 
inspections, together with 
modern, sensitive leak-detection 
tools and regular removal of 
accumulated deposits. 

The failure of old pipelines is 
becoming increasingly common, 
and can be dangerously 
disruptive to communities and 
the environment. Examples 
from the United States include 
the rupture in 2010 of a 41-year- 
old oil pipeline in Michigan, 
which spilled around 4.5 million 
litres of oil into the Kalamazoo 
River, and the 2013 failure of a 
65-year-old pipeline in Arkansas, 
requiring 22 homes to be 
evacuated. 

Modern pipelines built from 
high-quality steels are statistically 
safer than transporting such 
fuels by road or rail (J. Behar and 
S. Al-Azem World Pipelines 15 
(4), 18-28; 2015). However, over 
half of US underground pipelines 
are more than 50 years old (see 
go.nature.com/gczd6b). Such 
aged pipelines could fail at any 
time from corrosion, cracking or 
coating deterioration (X. Li et al.
Nature 527, 441-442; 2015). 
Frank Cheng University of 
Calgary, Alberta, Canada. 
fcheng@ucalgary.ca 


Fibre for the future


A chronic lack of dietary fibre has been found to reduce the diversity of bacteria in the guts of mice. This effect is not fully
reversed when fibre is reintroduced, and increases in severity over multiple generations. SEE LETTER P.212 


ERIC C. MARTENS 


People living in industrialized nations
routinely consume much less than the
recommended amount of 25-38 grams
of dietary fibre per day. Physicians and nutri- 
tionists have been imploring us for decades 
to bolster our fibre intake to help stave off 
maladies ranging from heart disease to intes- 
tinal disorders. The mechanisms through 
which fibre consumption modulates health 
are manifold, including a role in maintaining 
our resident gut microorganisms. On page 212 
of this issue, Sonnenburg et al.¹ reveal that a
lack of dietary fibre leads to a substantial loss 
of diversity in this microbial community, and 
influences the ability of gut bacteria to be 
transferred from parents to their offspring. 
Furthermore, it seems that simply restoring 
fibre consumption is not enough to reverse this 
effect once it has been passed to subsequent 
generations. 

The ‘fibre’ that we see quantified on food 
labels is a catch-all category encompassing 
dozens of different molecules, mostly complex 
carbohydrates (linear and branched chains of 
simple sugars such as glucose). But the human 
genome encodes only around a dozen digestive 
enzymes that target complex carbohydrates. 
Technically speaking, dietary fibre comprises 
the polymeric molecules that cannot be broken
down by these enzymes. However, these nutri- 
ents do not go to waste. Instead, the diverse 
microorganisms that have evolved to inhabit 
the human intestine — collectively called 
the gut microbiota — produce thousands of 
enzymes that specifically target dietary fibre.
Some individual bacteria produce more than
300 such enzymes. These organisms ferment
the released sugars into short-chain fatty acids,
which are used as fuel for intestinal cells
and which influence systemic physiology and
the development of immune responses.

The gut microbiota of each person typically 
contains hundreds of different bacterial 
species. We do not each harbour exactly the 
same community members; rather, the com- 
position of our microbiota is drawn from a 
larger set of potential colonizers on the basis 
of parental and environmental exposure that 
begins at birth. Many microorganisms that live 
in the human gut exist only in this niche, and 
thus rely on successful transfer between gen- 
erations to avoid extinction. 

Sonnenburg et al. posed the question: what 
happens to the microbiota when dietary fibre is 
withheld for prolonged periods? The research- 
ers colonized the intestines of germ-free mice 
(those that lack any resident microorganisms) 
with a human faecal sample, which contains a 
representative complement of the gut micro- 
biota members. They then fed the mice a diet 
rich in dietary fibre or one that contained only 
low fibre, in a form poorly accessible to the 
microbiota. After several weeks of fibre depri- 
vation, the microbiota showed a reduction in 
the abundance of many bacterial groups that 
had been previously present (Fig. 1). These 
bacteria continued to thrive in the mice that 
were fed a high-fibre diet. When the fibre- 
starved mice were returned to a normal diet 
and allowed to recover for several weeks, many 
of these groups came back, but some failed to 
return to their previous levels, revealing that 
prolonged diet shifts can inflict changes that 
persist after dietary intervention. 

The authors next investigated how fibre 
consumption affects the microbiota over 
multiple generations. They allowed the mice 
colonized with human bacteria, from both the 
high- and low-fibre cohorts, to breed within 
their cohorts, and for natural microbial coloni- 
zation of the offspring to occur through mater- 
nal contact. Offspring born to parents fed the 
low-fibre diet had reduced microbiota diver- 
sity irrespective of whether they were weaned 
onto the same diet as their parents or onto a 
high-fibre diet. Strikingly, the reduction in 
gut bacterial diversity that was observed in the 
first generation was compounded over each
of four subsequent generations. Moreover,
the inferred genomic content of the bacteria
that remained after four generations suggested
that the abundance of several fibre-degrading
enzyme families had been reduced. But further
work is required to find out whether a loss in
fibre-degrading capacity occurred.

Figure 1 | Loss of diversity. Sonnenburg et al.¹ found that mice fed a low-fibre diet had a lower species diversity in their gut microbiota than mice fed a
high-fibre diet. In first-generation mice, most (but not all) of this diversity was recoverable when mice on the low-fibre diet were switched to a high-fibre diet.
However, the authors found that diversity loss was greater in each subsequent generation maintained on a low-fibre diet, and that the degree of recovery also
decreased, implying extinction of some microbial species.

To assess whether dietary change might 
ameliorate these deficiencies, Sonnenburg 
et al. placed some of the mice from each gen- 
eration of fibre-deprived mice on a high-fibre 
diet. The inability to recover lost diversity was 
a consistent characteristic at each generation 
(Fig. 1). However, transplanting the fibre- 
starved mice with a faecal sample from mice 
fed a high-fibre diet successfully restored most 
of the missing bacteria. 

It is becoming increasingly apparent that 
the gut microbiota of people in cultures that 
eat less-processed and higher-fibre diets differ 
from those of people in industrialized coun- 
tries, and often contain a higher diversity of 
microorganisms. Humans have co-evolved
with symbiotic bacteria, and these microbial 
partners shoulder most of the burden of digest- 
ing complex carbohydrates. It remains to be 
determined whether some of this functionality 
has already been lost in some people and, if 
so, to what extent. However, in the future, we 
may turn to probiotic formulations, possibly 
derived from humans or animals that have not 
yet restricted their gut microbiome through a 
low-fibre diet, to restore essential functions 
that have been lost. 

Carbohydrates frequently get a bad rap in 
fad diets, largely owing to simple carbohydrates 
such as glucose and fructose that permeate 
Western diets and provide us with an excess of 
easy calories. However, their complex cousins 
that are naturally present in plants, whole 
grains and a variety of other sources are worth 
consuming in greater amounts. Two authors 
of this study last year published a book for 
the popular press, The Good Gut, which
chronicles the interaction between diet, the 
microbiome and health, and is replete with 
high-fibre recipes. You just might consider 
choosing a salad at lunch today or an extra 
serving of beans at dinner. Future generations 
may thank you, too.


Photons from dwarf
galaxy zap hydrogen 


The detection of photons sufficiently energetic to ionize neutral hydrogen, coming 
from a compact, star-forming galaxy, offers clues to how the first generation of 
galaxies may have reionized hydrogen gas in the early Universe. SEE LETTER P.178 


DAWN K. ERB 


Most of the ordinary matter in the
Universe is found not in stars, but in
the diffuse gas between galaxies: the
intergalactic medium (IGM), which is mainly 
hydrogen. This gas is almost completely ion- 
ized, and has been so since the formation of 
the first stars and galaxies a few hundred mil- 
lion years after the Big Bang. But few details are 
known about the sources of the radiation that 
ionized the gas, or how this radiation escaped 
from its source galaxies. On page 178 of this 
issue, Izotov et al.¹ report the detection of ionizing
radiation from a star-forming dwarf galaxy
in the local Universe, which may clarify the
escape question. 

The hydrogen gas that pervades the Universe 
has undergone phase changes over 13.8 bil- 
lion years. The early Universe was too hot for 
protons and electrons to combine into neutral 
hydrogen, so the hydrogen was ionized. The
Universe cooled as it expanded, and about 
375,000 years after the Big Bang, the tempera- 
ture decreased enough for neutral hydrogen 
to form. 

The gas remained neutral for the next few 
hundred million years, until the epoch of 
reionization — the last major phase transi- 
tion in the Universe (Fig. 1). This transition 
occurred when the first sources of photons 
that were energetic enough to ionize hydro- 
gen appeared in sufficient numbers to reionize 
the IGM. These photons are known as Lyman 
continuum photons because they have 
wavelengths shorter than the Lyman limit 
of 912 ångströms, which corresponds to the
energy required to ionize the hydrogen atom 
(13.6 electronvolts). 
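
The correspondence between the Lyman limit and the ionization energy follows from the photon-energy relation E = hc/λ. The Python sketch below is a minimal check of that arithmetic. It assumes the rounded value hc ≈ 12,398.42 eV Å, and the function names are invented for this illustration.

```python
HC_EV_ANGSTROM = 12398.42     # Planck constant times speed of light, in eV * angstrom (rounded)
LYMAN_LIMIT_ANGSTROM = 912.0  # Lyman limit as quoted in the text


def photon_energy_ev(wavelength_angstrom):
    """Photon energy in electronvolts for a wavelength given in angstroms."""
    return HC_EV_ANGSTROM / wavelength_angstrom


def is_lyman_continuum(wavelength_angstrom):
    """True if the photon's wavelength is shorter than the Lyman limit."""
    return wavelength_angstrom < LYMAN_LIMIT_ANGSTROM


print(round(photon_energy_ev(LYMAN_LIMIT_ANGSTROM), 1))
print(is_lyman_continuum(900.0), is_lyman_continuum(1216.0))
```

A 912 Å photon comes out at about 13.6 eV, matching the figure above. A 900 Å photon counts as Lyman continuum, whereas a longer ultraviolet wavelength such as 1,216 Å does not.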

We now know, from the scattering of cosmic
microwave background photons by reionized
electrons² and from observations of the
absorption by neutral hydrogen in the spectra
of extremely distant quasars³, that reionization
was a gradual process, the midpoint of which
occurred approximately 400 million years
after the Big Bang. However, we have not yet
observed the changing ionization state of the
IGM directly, and theoretical models struggle
to explain how the known population of galaxies
at this epoch could have produced enough
radiation for reionization to occur.

Figure 1 | Escape of ionizing radiation from a galaxy. The hydrogen gas between galaxies was ionized
by the first stars and galaxies when the Universe was about 400 million years old. This probably occurred
gradually (main panel): radiation from stars and galaxies ionized increasingly large bubbles of gas, until
the ionized regions completely overlapped. For this to happen, photons with enough energy to ionize
hydrogen must escape from galaxies. Such escape is challenging, because galaxies are filled with
neutral hydrogen gas that absorbs ionizing radiation (inset; arrows represent ionizing photons).
Izotov et al.¹ report the detection of ionizing photons from a compact starburst galaxy in the nearby
Universe, a discovery that helps to explain the conditions that allow such radiation to escape from galaxies.

The problem is twofold: large numbers of
faint galaxies seem to be required to supply the
necessary radiation⁴, but photons must also be
able to escape from the galaxies in which they
are produced. Stars form out of cool gas, and
thus star-forming galaxies are filled with neutral
hydrogen. Neutral hydrogen absorbs ionizing
photons and therefore, even in a galaxy
whose stars produce copious amounts of energetic
radiation, the proportion of this radiation
that actually escapes from the galaxy (the
escape fraction) may be extremely low.

The Lyman continuum radiation from the 
galaxies that reionized the Universe will never 
reach our telescopes, because the photons are 
absorbed by the IGM long before they get to 
Earth. It is possible to detect ionizing radia- 
tion from closer galaxies, those correspond- 
ing to an age of the Universe of approximately 
1.6 billion to 2 billion years (see refs 5 and 6, 
for example), but such detections are com- 
plicated by the probability of contamina- 
tion by non-ionizing radiation originating 
in faint galaxies along the line of sight. In some 
ways, galaxies in the local Universe offer the 
best prospect for a detailed understanding of 
how ionizing radiation escapes from galaxies. 
Complicating factors are the unknown simi- 
larities and differences between local galaxies 
and the galaxies of the reionization era, and the 
fact that Earth's atmosphere blocks the ultra- 
violet wavelengths of this radiation, requiring 
that observations be made from space. 

Izotov et al. focused on the compact dwarf
galaxy J0925+1403, deemed likely to produce
escaping ionizing radiation on the basis
of properties inferred from its optical emission
lines. These lines indicate an unusually
high ionization state in the gas near the
galaxy’s star-forming regions, suggesting that
the stars may make more ionizing radiation
than can be absorbed by the surrounding gas.
The authors’ successful detection of ionizing
photons is the fourth such observation from a
nearby galaxy.

Crucially, this galaxy has the highest escape
fraction yet measured locally: about 8%, compared
with the roughly 1-3% escape fraction
measured from other nearby galaxies. The
total amount of escaped radiation is sufficient
to ionize a mass of IGM gas 40 times greater
than the galaxy’s stellar mass. Finding Lyman
continuum radiation from this galaxy therefore
broadly confirms our understanding of
the general conditions that may facilitate the
escape of ionizing radiation.


However, much work remains to be done 
to understand how galaxies reionized the 
Universe: the current study involves a single 
galaxy, whereas reionization depends on the 
properties of a population. It is not yet clear 
whether or not J0925+1403 is typical of com-
pact, highly ionized starbursts (galaxies with
extremely high rates of star formation) in the 
nearby Universe. We also do not know whether 
this galaxy is similar to those that reionized the 
Universe; its small size, high ionization state 
and relatively low degree of enrichment by ele- 
ments heavier than helium generally match the 
expected properties of such objects, but none 
of these properties has been measured for the 
earliest galaxies. 

Izotov et al. report that J0925+1403 leaks a
large number of ionizing photons relative to its 
ultraviolet luminosity. This finding will inform 
future theoretical models of the reionization of 
the Universe by faint galaxies, but it remains 
to be determined whether this result is typi- 
cal in the local Universe or representative of 
galaxies in the reionization era. The authors’
detection therefore emphasizes the need for
additional, larger studies to develop a statistical 
understanding of Lyman continuum escape in 
the local Universe and its relationship to the 
properties of galaxies more generally.

Transcriptional control
of endothelial energy 


The formation of blood vessels requires rapid proliferation of endothelial cells. 
The transcription factors FOXO1 and MYC have been found to regulate the 
metabolism and proliferation of vascular endothelial cells. SEE LETTER P.216 


CHRISTER BETSHOLTZ 


Research during the past decade has
generated extensive knowledge of the
cellular and molecular mechanisms
of angiogenesis, the process through which 
new blood vessels form from existing ones. 
Although we have learnt much about how 
vessels sprout, elongate, branch, form lumens 
and regress’, one piece of the angiogenic puzzle 
has remained poorly explored: how the pro- 
liferation of the endothelial cells that line 
the vessels’ interior is regulated. In this issue, 
Wilhelm et al.” (page 216) show that the tran- 
scription factor FOXO1 couples growth-factor 
signalling to the metabolism, growth and 
division of endothelial cells. The authors also 
identify the protein MYC, a known driver of 
cancer development and the anabolic metabo- 
lism that constructs tissues’, as a key mediator 
of endothelial FOXO1 function. 

Unicellular organisms have evolved to grow 
and divide whenever nutrients are abundant, 
and, conversely, to enter states of inactivity 
when nutrients are scarce. Multicellular organ-
isms function differently. In our bodies, most
cells are exposed to a constant, rich supply of
nutrients, but proliferate only when stimulated 
by growth factors, such as during organ devel- 
opment and regeneration. For endothelial 
cells, the situation differs again — these cells 
proliferate and form new blood vessels when 
oxygen and nutrients are low, with the aim of 
increasing oxygen and nutrient delivery to 
other cells in the tissues. Once functional ves- 
sels have been established, the endothelial cells, 
now also exposed to high levels of oxygen and 
nutrients, cease to proliferate. 

Signalling induced when vascular endo- 
thelial growth factor A (VEGFA) binds to 
VEGF receptor 2 (VEGFR2) is the prin- 
cipal driver of most of the fundamental 
morphogenetic events involved in angiogen- 
esis, including endothelial cell proliferation. 
A key pathway downstream of VEGFR2 is the 
PI3K-AKT pathway’ — a powerful regulator 
of glucose metabolism and protein synthesis’. 
The protein AKT also inhibits the activity of 
FOXO transcription factors by phosphorylat- 
ing them: this causes them to be redistributed 
from the cell nucleus to the cytoplasm’ (Fig. 1). 

Figure 1 | FOXO1 and MYC contribute to regulating angiogenesis. As new blood vessels form,
extensive proliferation of the endothelial cells that line them takes place just behind the sprouting tip of
the vessel. Wilhelm et al.’ show that, in these endothelial cells, the transcription factor FOXO1 is located
in the cytoplasm, possibly as a result of having been phosphorylated (P) through the activity of the
PI3K-AKT signalling pathway that is induced when vascular endothelial growth factor A (VEGFA) binds
to its receptor (VEGFR2). As a consequence, FOXO1 cannot exert its function, described by the authors,
in inhibiting the transcription factor MYC, which remains in the nucleus. The resulting enhanced MYC
activity leads to increased cellular metabolism, growth and proliferation. By contrast, vessel maturation,
which occurs more centrally in the vascular network, coincides with cessation of growth-factor signalling.
Presumably at this stage, FOXO1, now non-phosphorylated, moves to the nucleus and inhibits MYC,
thereby inducing endothelial-cell quiescence.

Wilhelm et al. set out to test the idea that
FOXO1, the FOXO family member enriched
in endothelial cells, might constitute a link 
between growth-factor signalling input, 
metabolism and cell proliferation (Fig. 1). The 
researchers inactivated the Foxo1 gene specifi- 
cally in endothelial cells of newborn mice and 
found that this led to overgrowth of these cells 
and the formation of a hugely disorganized 
and dilated vascular network in the develop- 
ing mouse retina. Conversely, endothelial- 
specific expression of a constitutively active 
FOXO1 protein resulted in a sparse retinal 
vasculature composed of fewer than normal 
endothelial cells. 

Expression of constitutively active FOXO1 
also led to decreased glucose uptake, glycolysis 
and lactate production in endothelial cells. The 
authors further observed decreases in oxygen 
consumption, the production of reactive oxy- 
gen species and levels of the energy-carrying 
molecule ATP — all features that correspond 
with reduced cellular metabolic activity. The 
cells survived, but entered a state of metabolic 
quiescence (a form of dormancy), accompa- 
nied by lower expression of genes that are tar- 
geted by the transcription factor MYC. 

Because MYC is known to regulate all of the
above-mentioned metabolic processes, and
because inhibition of MYC by FOXO occurs 
in other cells°, the authors tested whether 
MYC might be the mediator of the prolifera- 
tion-stimulating effect of Foxo1 inactivation. 
They found that constitutively active FOXO1 
suppressed MYC expression and inactivation 
of Foxo1 had the opposite effect. FOXO1 also 
increased expression of negative regulators of 
MYC activity, including MXI1 and FBXW7, 
suggesting that FOXO1 inhibits MYC at sev- 
eral levels. Furthermore, the authors show that 
MYC overexpression restored metabolism and 
proliferation in endothelial cells with consti-
tutively active FOXO1, and repaired vascular
defects induced by this treatment. Together,
these data provide compelling evidence for
MYC as an effector of Foxo1 deficiency in
endothelial cells. 

Wilhelm and colleagues’ work reveals a 
central mechanism whereby the control of 
endothelial-cell proliferation is linked to the 
cells’ metabolic state. But as with all good 
studies, it generates many questions. Does 
the amount of FOXO1 change during angio- 
genesis, or is its activity regulated solely by 
transport in and out of the nucleus, as the
authors’ results might suggest (see Figure 1a
of the paper’)? Although the experimen- 
tal methods used by the authors are the best 
available, they involved vast changes in FOXO1 
levels (complete loss or several-fold increase), 
which would not occur in physiological set- 
tings. The striking normalization of the retinal 
vasculature observed in mice overexpressing 
both constitutively active FOXO1 and MYC 
might result from a new balance, achieved 
through similarly increased levels of the two. 
A more critical test of the role of FOXO1 as a 
‘rheostat’ of vascular expansion, a term used 
by the authors, should ideally include manip- 
ulations of the activity of FOXO1 — such as 
of its nuclear-cytoplasmic shuttling — at 
normal levels. 

Moreover, the question of how FOXO1 
nuclear translocation is regulated in endothelial 
cells remains unresolved. In analogy with the 
regulation of other FOXO proteins by a growth-
factor-PI3K-AKT axis, one would guess that
signalling downstream of VEGFA–VEGFR2
binding plays a central part. However, FOXOs 
have several upstream inputs besides AKT®, 
and FOXO1 signalling and its role in endothe- 
lial cells might be multifaceted and context- 
dependent. For example, in adult mice, deletion 
of multiple FOXO proteins, including FOXO1, 
leads to the formation of benign endothelial- 
cell tumours known as haemangiomas in some 
organs, but not all’. Similarly, VEGF induces 
enhanced proliferation in cultures of non- 
FOXO-expressing endothelial cells from some 
organs, but not others’. 

Although this association between FOXO 
dysregulation and endothelial tumour forma- 
tion concurs with Wilhelm and colleagues’ 
idea that FOXO1 is a major regulator of 
endothelial-cell proliferation, and extends
their observations into adult animals, the non- 
uniform and tissue-type-dependent responses 
are intriguing. Further study of FOXOs is 
warranted, particularly in the regulation of 
endothelial metabolism and proliferation at 
different stages of development and different 
vascular sites.

CLIMATE SCIENCE


Earth’s narrow escape 
from a big freeze 


An equation has been derived that allows the timing of the onset of glaciations 
to be predicted. This confirms that Earth has just missed entering a new glacial 
period, and is unlikely to enter one for another 50,000 years. SEE LETTER P.200 


MICHEL CRUCIFIX 


A subject of much debate is whether
atmospheric levels of carbon dioxide
were already significantly altered by
emissions associated with human activities 
before the Industrial Revolution in the eight- 
eenth century. One estimate suggests that the 
atmospheric concentration of CO₂ would
have been only 240 parts per million (p.p.m.) 
in an agriculture-free world, rather 
than 280 p.p.m., as was measured 
just before the Industrial Revolu- 
tion’. On page 200 of this issue, 
Ganopolski et al.’ report modelling 
studies confirming that we would 
now be entering an ice age if the 
concentration had remained at 
240 p.p.m. By contrast, they report 
that glacial inception — the onset 
of an ice age — could not have 
occurred at CO₂ concentrations
that were typical of the eighteenth 
century. 

The Quaternary period has 
conventionally been divided 
into two epochs: the Pleisto- 
cene, which lasted from about 
2.59 million to 12,000 years 
ago, and the Holocene, which 
followed the Pleistocene and 
continues to the present day. The 
Pleistocene was a time of great, 
successive glaciations interspersed 
with interglacial periods, during 
which environmental conditions 
were similar to those occurring 
today. During the Holocene — the 
latest interglacial period — humans 
invented agriculture, and their 
impact on the environment 
increased at an exponential rate. 
One of the signatures of this impact 
is the rising concentration of CO₂
in the atmosphere. But at what 
point does this impact become 
sufficiently large to affect climate 
and glacial inception? 

Figure 1 | An eighteenth-century smokehouse. The atmospheric level of
carbon dioxide just before the Industrial Revolution was 280 parts per million
and may already have been affected by emissions associated with human
activities. Ganopolski et al.’ report models suggesting that atmospheric CO₂
levels typical of the eighteenth century were high enough to prevent the onset
of a glacial period for 50,000 years.

A modelling study’ in 2000
established that pre-industrial
levels of CO₂ were high enough to
guarantee a period of interglacial
conditions for at least 50,000 years (Fig. 1).
Consistent with Ganopolski and colleagues’
findings, this earlier study also predicted
that the next glacial inception (which would
have led to a glaciation that reached an ice
maximum 60,000 years from now), could not
now occur owing to the warming effect of
anthropogenic emissions. Moreover, proba-
bilistic assessments*” of the timing of the
next glacial inception have been provided by
using simple dynamical systems for climate
prediction, calibrated using data about past 
CO₂ levels and ice volumes. All of these stud-
ies used different models and assumptions,
but they broadly agree on the potential tim- 
ing for a glacial inception because their fore- 
casts are determined by predictable drops in 
incoming solar radiation (insolation) in the 
Northern Hemisphere caused by changes in 
Earth's orbit. 

Ganopolski and co-workers’ study is an 
advance on previous work because it provides 
a simple equation for predicting when glacial 
inception will occur. The researchers observed 
that, in the Earth-system model they used for 
their study (CLIMBER-2), ice begins to form 
when insolation in the Northern Hemisphere 
at the summer solstice falls below a certain 
value that depends logarithmically on the con-
centration of atmospheric CO₂. They were thus
able to work out an equation that describes this 
behaviour. 
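
The threshold behaviour described above can be sketched in a few lines of Python. This is only an illustration of the functional form (a critical insolation that falls with the logarithm of the CO₂ concentration). The coefficient values, the function names and the choice of 280 p.p.m. as the reference level are assumptions made for this sketch and are not the calibrated values reported by the authors.

```python
import math

# Hypothetical coefficients, chosen for illustration only.
# They are NOT the values calibrated by Ganopolski et al.
BASELINE_THRESHOLD_W_M2 = 455.0  # assumed threshold at the reference CO2 level
LOG_SENSITIVITY_W_M2 = 75.0      # assumed drop in threshold per e-fold rise in CO2
REFERENCE_CO2_PPM = 280.0        # pre-industrial level quoted in the text


def critical_insolation(co2_ppm: float) -> float:
    """Summer-solstice insolation below which ice starts to grow (toy form)."""
    if co2_ppm <= 0:
        raise ValueError("CO2 concentration must be positive")
    log_ratio = math.log(co2_ppm / REFERENCE_CO2_PPM)
    return BASELINE_THRESHOLD_W_M2 - LOG_SENSITIVITY_W_M2 * log_ratio


def glacial_inception(insolation_w_m2: float, co2_ppm: float) -> bool:
    """True when insolation falls below the CO2-dependent threshold."""
    return insolation_w_m2 < critical_insolation(co2_ppm)


# Same (hypothetical) insolation, two CO2 levels discussed in the article.
for co2 in (240.0, 280.0):
    print(co2, glacial_inception(460.0, co2))
```

With these made-up coefficients, the same insolation value crosses the threshold at 240 p.p.m. but not at 280 p.p.m., which mirrors the qualitative contrast the article describes.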

To calibrate the equation, the authors 
performed several simulations that differed by 
the value of a parameter that con- 
trols cloud height in their model. 
This sampling process effectively 
generates a family of model ver- 
sions, which the authors tested to 
see which ones predicted past gla- 
cial inceptions. Past glaciations and 
interglacials have been identified 
on the basis of isotopic data from 
marine sediments, and they fol- 
low a numbering scheme in which 
isotope ‘stages’ with odd numbers 
roughly correspond to interglaci- 
als. The authors paid special atten- 
tion to the glacial inceptions after 
marine isotope stages 19 and 11, 
and to the period after marine 
isotope stage 1 (that is, the Holo- 
cene), because insolation evolved 
in a similar way at those times but
led to different outcomes (stage 1 
did not produce a glacial incep- 
tion). Only the parameter values 
that yielded correct simulations of 
all past glacial inceptions were used 
to establish the equation. 

The authors were thus able to 
confirm that Earth had a narrow 
escape from glacial inception dur- 
ing the Holocene: the increase 
in atmospheric CO₂ levels dur-
ing this period was sufficient to
prevent the planet from entering 
a glacial period. The authors also 
report that an interglacial climate 
would have continued for at least 
20,000 years, and more plausibly 
for 50,000 years, if CO₂ concentra-
tions had been sustained at levels
typical of the eighteenth century. 
However, almost 500 gigatonnes of 
carbon (GTC; 1 GTC is equivalent
to about 3.7 gigatonnes of CO₂) have been released
into the atmosphere since the Industrial 
Revolution. Ganopolski et al. show that this 
means that we will probably skip the next 
glacial inception too: emissions of 1,000 GTC 
(a scenario that is quite likely) will almost guar- 
antee 100,000 years without any glaciation. 

Such long-term consequences may seem 
surprising, given that the emissions will occur 
over a few centuries at most and that anthro-
pogenic CO₂ will eventually be absorbed by
the oceans. But for this absorption to occur,
carbonate minerals in the ocean will need to be
dissolved, to counteract the increase in ocean
acidity that occurs when CO₂ is absorbed, and
which limits the amount of CO₂ that can be
dissolved. This takes time. In fact, the mean
half-life of CO₂ in the atmosphere is of the
order of 35,000 years®. Consequently, anthro-
pogenic CO₂ will still be in the atmosphere in
50,000 years’ time, and even 100,000 years, 
which is enough to prevent any glaciation. 
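
To give a feel for what a 35,000-year half-life implies, the following sketch treats the removal of an initial CO₂ excess as a single exponential decay. That is a simplifying assumption made here for illustration: real carbon-cycle removal involves several processes with different timescales, and the function name and default value are hypothetical.

```python
def fraction_remaining(years: float, half_life_years: float = 35_000.0) -> float:
    """Fraction of an initial CO2 excess left after `years`,
    assuming simple exponential decay with the given half-life."""
    if years < 0 or half_life_years <= 0:
        raise ValueError("years must be >= 0 and half-life must be > 0")
    return 0.5 ** (years / half_life_years)


# Time horizons mentioned in the article.
for horizon in (50_000, 100_000):
    print(horizon, round(fraction_remaining(horizon), 3))
```

Under this assumption, somewhat more than a third of the excess would remain after 50,000 years and a bit more than a tenth after 100,000 years.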

The method used by Ganopolski et al. is 
known as ‘perturbed physics’ sampling. This 
means that the different scenarios for future 
climate were sampled by modifying one 
parameter, which controls one of the physi- 
cal effects described by the model. But no 
model is perfect, and all the possible errors 
associated with the model cannot be entirely 
compensated for by adjusting this parameter. 
To provide better predictions, we need to pay 
special attention to climate processes that are 
currently not well quantified. 

Among them, the causes of CO₂ changes
during past interglacial periods and during 
the early stages of glaciation remain a matter 
of controversy. For example, we are uncertain 
about the amplitude and dynamics of carbon 
sequestered in peatlands’”. More fundamen-
tally, we do not yet know whether natural CO₂
dynamics have an active role in causing glacial 
inception, or whether they passively amplify 
the effects of accumulating ice at northern 
high latitudes. In spite of these uncertainties, 
Ganopolski and colleagues’ main conclusion 
is likely to stand. It reinforces previous assess- 
ments asserting that humanity’s collective 
footprint on Earth already extends beyond any 
imaginable future of our society.

A trail map for
trait-based studies 


Global assessments of variation in plant functional traits and the way that these 
traits influence competitive interactions provide a launching pad for future 
ecological studies. SEE ARTICLE P.167 & LETTER P.204 


JONATHAN M. LEVINE 


Ecologists explore the processes that
structure the natural world around us.
But this can seem an uphill battle when
nature presents such a wide diversity of species, 
each with its own set of interactions with the 
environment. One way to make sense of this 
diversity and its mechanistic underpinnings is 
to focus not on species but on the functional 
traits they possess, such as plant height, seed size 
or leaf area. Two papers in this issue advance 
our understanding of how traits vary between 
plant species, and the ramifications of this vari- 
ation for competitive interactions. Diaz et al." 
(page 167) document the patterns of functional- 
trait variation among plant species worldwide 
and reveal fundamental constraints on plant 
form that allow the organisms to survive natural 
selection, physiological challenges and competi- 
tive exclusion. Kunstler et al.” (page 204) show 
how functional traits consistently predict the 
competitive interactions between trees in six 
forested biomes, with effects counter to expec- 
tations from classic theory. 


In the nineteenth century, Alexander von 
Humboldt and Charles Darwin wrote at length 
about the surprising diversity in form and 
function of organisms on Earth, and this topic 
still intrigues naturalists today. For much of the 
history of ecology, most patterns of diversity 
and abundance have been studied at the spe- 
cies level. But a growing number of ecologists, 
and plant ecologists in particular, think that 
studies focused on functional traits present 
greater opportunity for generality and predict- 
ability, and a tighter connection to organismal 
function*®. Indeed, it is a species’ traits that
determine the organism's growth, dynamics 
and interactions, not its taxonomic nomencla- 
ture. The implication is that a more productive 
way of asking, for example, the classic question 
of what processes maintain the diversity of spe- 
cies is to ask what processes explain the disper- 
sion of traits among community members. 

[Figure 1 labels: predicted weak competition; predicted intense competition; wood density (low to high); leaf mass per area (high to low); nitrogen content.]
Figure 1 | Trait dimensions and competition. a, Diaz et al.’ conducted an analysis of six plant traits
across more than 45,000 species and found that most of the variation between species in this six-
dimensional trait space lies along a two-dimensional plane. One dimension describes variation in
leaves having an acquisitive strategy (low leaf mass per unit area and high nitrogen content) versus
a conservative one (high leaf mass per unit area and low nitrogen content)°. The other dimension
corresponds to variation in plant size (height and seed mass). Following classic ecological theory, species
that are more similar to one another in the trait space should compete more intensely than would more
different taxa. b, However, by working with a global data set of forest-tree growth, Kunstler et al.” found
little to no support for this expectation. Instead, certain trait values were predictive of competitive
superiority, such as greater wood density being associated with greater resistance to competition.

Diaz et al. have laid the groundwork for this
approach and a wide range of ecological and
evolutionary investigations by quantifying
the dimensionality of plant ‘trait space’, the
multivariate space in which a plant can be
located on the basis of its functional traits. The
researchers focused on six traits that are read- 
ily available for a large number of taxa and that 
are central to determining a plant's ecological 
strategy: plant height, leaf area, leaf mass per 
area, nitrogen content per mass, stem mass 
per volume and seed mass. Their analysis, 
which incorporates these values from more 
than 45,000 species, is the first of its scope to 
explore relationships across seed, leaf, stem and 
whole-plant traits. This scale was possible only 
because of a database called TRY (ref. 7), which 
contains 5.6 million records of plant functional 
traits assembled over the past decade. 

In principle, a given plant could occupy 
any point within this six-trait space. To assess 
how constrained plant species actually are 
within this space, the authors compared their 
observations with four null models represent- 
ing different distributions of and correlations 
between traits. They found that, worldwide, 
plant species occupy only a small fraction of 
their potential trait space, and the observed 
pattern is driven largely by strong correla- 
tions between functional traits across species 
(Fig. 1). The researchers then conducted a 
principal component analysis and identified 
two primary dimensions in which plants vary 
globally: plant size, ranging from short species 
with small seeds to tall species with large seeds; 
and leaf strategy’, ranging from ‘acquisitive’
species with low leaf mass per unit area and 
high leaf nitrogen content to ‘conservative’ spe- 
cies with high leaf mass per unit area and low 
nitrogen content. 
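
The idea that six correlated traits can collapse onto roughly two dimensions can be illustrated with a small principal component analysis on synthetic data. Everything in this sketch is an assumption made for illustration: the data are random numbers generated from two hidden axes, the six columns are generic stand-ins rather than the traits measured by the authors, and the eigenvalues are found with a simple power iteration instead of whatever software the study used.

```python
import math
import random


def standardize(columns):
    """Scale each trait column to zero mean and unit variance."""
    scaled = []
    for col in columns:
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (len(col) - 1))
        scaled.append([(x - mean) / sd for x in col])
    return scaled


def correlation_matrix(columns):
    """Correlation matrix of already standardized columns."""
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in columns]
            for ci in columns]


def top_eigenvalues(matrix, k, iterations=300):
    """Leading k eigenvalues of a symmetric matrix (power iteration + deflation)."""
    m = [row[:] for row in matrix]
    size = len(m)
    values = []
    for _ in range(k):
        vec = [1.0 / math.sqrt(size)] * size
        for _ in range(iterations):
            w = [sum(m[i][j] * vec[j] for j in range(size)) for i in range(size)]
            norm = math.sqrt(sum(x * x for x in w))
            vec = [x / norm for x in w]
        lam = sum(vec[i] * sum(m[i][j] * vec[j] for j in range(size))
                  for i in range(size))
        values.append(lam)
        # Remove the component just found so the next pass finds the following one.
        m = [[m[i][j] - lam * vec[i] * vec[j] for j in range(size)]
             for i in range(size)]
    return values


def demo(seed=0, n_species=400):
    """Return (top two eigenvalues, share of variance they explain)."""
    rng = random.Random(seed)
    axis_a = [rng.gauss(0, 1) for _ in range(n_species)]  # hypothetical hidden axis 1
    axis_b = [rng.gauss(0, 1) for _ in range(n_species)]  # hypothetical hidden axis 2
    traits = [[a + rng.gauss(0, 0.3) for a in axis_a] for _ in range(3)]
    traits += [[b + rng.gauss(0, 0.5) for b in axis_b] for _ in range(3)]
    eig = top_eigenvalues(correlation_matrix(standardize(traits)), 2)
    return eig, sum(eig) / len(traits)


eigenvalues, share = demo()
print(eigenvalues, share)
```

Because the six synthetic columns are built from only two hidden axes plus noise, the first two components capture most of the variance, which is the kind of pattern the principal component analysis in the study is described as revealing for real plants.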

It is difficult to evaluate the extent to which 
the constraints on plant form and function 
suggested by this analysis arise from bio- 
mechanical trade-offs, natural selection or 
competition. But this is where Kunstler and 
colleagues’ study comes in. These research- 
ers explored how three plant functional traits 
(leaf area per mass, plant height and wood 
density) predict the competitive interactions 
between forest tree species. Their data set is 
similarly impressive in scope to that of Diaz 
et al. — it includes three functional traits and 
trunk-diameter growth for more than 3 mil- 
lion trees from over 2,500 species in forest plots 
from 6 biomes. Taking advantage of natural 
variation in the density and identity of com- 
petitors surrounding a focal tree, the authors 
built a statistical model to quantify how a spe-
cies’ trait values predict its growth without
competition, its resistance to competition and 
its ability to suppress the growth of neighbours. 

The authors had good reason to expect that 
functional traits would predict competitive 
dynamics. Ecological theory holds that trait 
differences between species should cause these 
taxa to use the environment in different ways, 
resulting in a ‘niche difference’ that minimizes 
competition between species (Fig. 1a). Con- 
trary to these expectations, however, Kunstler 
and colleagues found little to no evidence 
that trait differences minimize competition
between trees across the six forest biomes.
Instead, certain trait values tended to pre- 
dict the competitive advantage of one species 
over others. Trees with high wood density, for 
example, tended to be most resistant to com- 
petition (Fig. 1b). These findings resonate with 
work*” positing that trait differences should 
predict both the niche differences that stabilize 
species coexistence and the competitive 
imbalances that drive species exclusion. 

If the three functional traits studied by 
Kunstler et al. predicted only competitive 
imbalances between forest trees, which traits 
explain their local coexistence? Although 
the authors point to trade-offs between high 
growth rate and competitive tolerance, their 
finding of greater competition within species 
than between species — a factor that also sta- 
bilizes species coexistence — must relate to 
traits other than those measured. This finding 
highlights the limitations of trait-based ecol- 
ogy. Although, in principle, all competitive 
dynamics must be explainable by plant traits, 
whether these are the functional traits that can 
be readily measured is an open question. 

Answering this question, as well as related 
ones at the population and ecosystem levels, 
will require further integration of functional 
traits and mathematical models along the
lines of Kunstler and colleagues’ approach.
Nonetheless, ecologists have a growing need 
to efficiently predict the nature of competition 
between plant species that do not co-occur 
today but will in the future as climate change 
causes species to migrate, and to migrate at dif- 
ferent rates’. Trait-based approaches may be 
our only recourse.

Pull out the stops
for plasticity 


The strength of synaptic connections between neurons needs to be variable, 
but not too much so. Evidence now indicates that regulation of such synaptic 
plasticity involves a complex cascade of feedback loops. 


CHRISTINE E. GEE & THOMAS G. OERTNER 


Learning is thought to manifest in the
brain as physical changes that alter the
strength of neuronal contact points
called synapses. These contact points allow 
information to be transmitted from one neu- 
ron to another, and understanding the condi- 
tions that cause synapses to change strength 
(a phenomenon known as synaptic plasticity) 
has been a focus of neuroscience research for 
many years. Writing in Nature Communica- 
tions, Tigaret et al.’ challenge the prevailing 
idea that the local concentration of calcium 
ions (Ca²⁺) is the key factor that determines
whether a synapse becomes stronger or weaker 
after repetitive activation. They propose that 
plasticity involves an intracellular signalling 
cascade that overrides a safety mechanism. 
This suggests that the default state of the 
synapse is not to be plastic.

The main excitatory neurotransmitter in the
mammalian brain is the molecule glutamate. 
Glutamate is released from the presynaptic 
neuron, and the postsynaptic neuron is excited 
when the molecule binds to and activates 
specialized receptor proteins, most of which 
are ion channels called ionotropic glutamate 
receptors. When activated, these channels 
open and positively charged ions enter the cell, 
depolarizing (reducing the voltage across) the 
cell membrane. In addition, glutamate recep- 
tors that are not ion channels, called metabo- 
tropic glutamate receptors, activate various 
intracellular signalling cascades. Their effect 
on synaptic transmission is generally slower 
than that of ionotropic receptors, but they are 
crucial for healthy brain function’. 

Figure 1 | The promotion of plasticity. a, The molecule glutamate is transmitted across the synaptic
cleft between neurons to activate the postsynaptic neuron. When the voltage across the cell membrane
decreases (depolarization), glutamate-bound NMDA-receptor proteins and voltage-gated calcium
channels (VGCCs) open, allowing calcium ions (Ca²⁺) to enter the cell. Under normal conditions, proteins
called SK channels are activated by this Ca²⁺ influx. Potassium ions (K⁺) flow out through
SK channels, decreasing depolarization and preventing changes in synaptic strength, known as plasticity.
b, Tigaret et al.' report that the metabotropic glutamate receptor protein mGlu₁ is activated by sustained
glutamate signalling, and leads to inhibition of SK channels. This slow-acting inhibition enables prolonged
depolarization and triggers strengthening of the synapse, known as long-term potentiation (LTP).

In the neuronal structure known as the
dendritic spine, which forms a single synaptic
contact, the initial depolarization caused by
activation of ionotropic glutamate receptors
can be amplified by the opening of voltage-
gated calcium channels, further depolarizing 
the spine. A special class of ionotropic gluta- 
mate receptors called NMDA receptors have a 
similar role — they open only when the neu- 
ron is already depolarized, forming a positive-
feedback loop that increases Ca²⁺ influx and
depolarization’. The activation of NMDA 
receptors is essential for many forms of long- 
lasting synaptic plasticity. 

However, positive-feedback loops are inher- 
ently dangerous for neurons — too much 
depolarization and Ca²⁺ can be toxic, eventu-
ally triggering cell death. To prevent this from
happening, spines have a safety mechanism
in the form of calcium-activated, potassium-
conducting SK channels*. When intra-
cellular Ca²⁺ reaches a critical concentration,
SK channels open, allowing positively charged
potassium ions to exit the cell and so pre-
venting further depolarization (Fig. 1a).
This SK mechanism stops the positive-
feedback loop, blocking further Ca²⁺ influx.
But a side effect is that the synaptic strength 
becomes difficult to change’. 

Tigaret et al. describe a plasticity-enabling 
mechanism that inhibits SK channels in 
individual spines. They found that repeated 
sequential activation of pre- and postsynaptic 
neurons in slices of rat brains induced synap- 
tic strengthening, also known as long-term 
potentiation (LTP). In addition to NMDA- 
receptor activation, LTP induction required 
the activity of group 1 metabotropic gluta-
mate receptors (mGlu₁). The authors show
that activation of mGlu₁ triggers a slow-acting
mechanism that inhibits SK channels, allow-
ing for sustained depolarization and enhanced
Ca²⁺ entry into the spine (Fig. 1b).

The authors used a strong induction pro- 
tocol (300 paired activations in 1 minute) to 
allow the relatively slow metabotropic pro- 
cess to take effect and enable LTP. This might 
seem unusual — after all, we don't need to be 
presented with information 300 times before 
learning a new association. Why was such a 
strong protocol required? 

In an intact brain, specific neuromodulator 
chemicals such as dopamine and acetylcholine 
are released when the animal is in an aroused 
state: for example, when it learns that a certain 
sound predicts a frightening event. These sub- 
stances modulate glutamate-activated synapses 
and have been shown to promote synaptic plas- 
ticity by blocking SK channels”®*. The timing 
window for successful induction of LTP has 
been shown’ to change radically in the presence 
of neuromodulators. Thus, it seems that there 
is not just one rule for how synapses change 
during learning, but a whole set that are tailored 
to various occasions such as different mental 
states. This makes sense from a systems per- 
spective — synaptic potentiation is gated not 
only by timing, but also by the brain’s reward 
system. From an experimental point of view, 
the lack of neuromodulatory inputs, which is 
an inherent limitation of brain-slice experi- 
ments, might explain why a strong protocol was 
required. Functional imaging of single synapses 
in live, active animals is not yet possible. 

The mechanism highlighted by Tigaret and 
colleagues is not the only way in which syn- 
apses can be strengthened. Enzymes called 
Src tyrosine kinases (which add phosphate 
groups to proteins) can directly enhance 
NMDA-receptor function. This pathway has
been shown to cause LTP in the same type of
synapse as that analysed in the current study.
The activity of various metabotropic receptors,
including mGlu1, can increase glutamate-mediated
responses through this pathway.
It will be interesting to investigate whether
the mGlu1-triggered blockade of SK channels
identified by Tigaret et al. acts together
with direct NMDA-receptor phosphorylation 
to enable LTP, or whether one mechanism is 
dominant under specific conditions, depend- 
ing, for instance, on cell type or the age of 
the animal. 

This study also confirms that, contrary
to general thinking, it is not possible to predict
the direction and magnitude of synaptic
plasticity by simply analysing levels of Ca²⁺
in dendritic spines. For example, a pair of
presynaptic stimulations triggered a very
strong Ca²⁺ influx into the spine, but no
plasticity whatsoever. But before Ca²⁺ is discarded
as the key state variable, we must
consider that successful induction of long-term
plasticity relies on the interplay of
local synaptic Ca²⁺ signals with Ca²⁺ signals in
the cell body (soma) of the neuron. Indeed, the
authors emphasize the importance of postsynaptic
electrical activity and the activation of
voltage-gated calcium channels for LTP. These
processes are not restricted to the active spine;
they increase Ca²⁺ levels throughout the neuron.
Thus, it might be possible to predict the
future strength of a synapse from simultaneous
Ca²⁺ measurements in the spine and soma. This
is certainly not an easy experiment, but sophisticated
3D scanning microscopes could be used
to analyse compartmentalized Ca²⁺ signalling
in individual neurons, and perhaps one day
in intact animals during learning.

The global spectrum of plant form and function


Sandra Díaz, Jens Kattge, Johannes H. C. Cornelissen, Ian J. Wright, Sandra Lavorel, Stéphane Dray, Björn Reu,
Michael Kleyer, Christian Wirth, I. Colin Prentice, Eric Garnier, Gerhard Bönisch, Mark Westoby,
Hendrik Poorter, Peter B. Reich, Angela T. Moles, John Dickie, Andrew N. Gillison, Amy E. Zanne,
Jérôme Chave, S. Joseph Wright, Serge N. Sheremet’ev, Hervé Jactel, Christopher Baraloto, Bruno Cerabolini,
Simon Pierce, Bill Shipley, Donald Kirkup, Fernando Casanoves, Julia S. Joswig, Angela Günther, Valeria Falczuk,
Nadja Rüger, Miguel D. Mahecha & Lucas D. Gorné


Earth is home to a remarkable diversity of plant forms and life histories, yet comparatively few essential trait combinations 
have proved evolutionarily viable in today’s terrestrial biosphere. By analysing worldwide variation in six major traits 
critical to growth, survival and reproduction within the largest sample of vascular plant species ever compiled, we found 
that occupancy of six-dimensional trait space is strongly concentrated, indicating coordination and trade-offs. Three- 
quarters of trait variation is captured in a two-dimensional global spectrum of plant form and function. One major 
dimension within this plane reflects the size of whole plants and their parts; the other represents the leaf economics 
spectrum, which balances leaf construction costs against growth potential. The global plant trait spectrum provides a 
backdrop for elucidating constraints on evolution, for functionally qualifying species and ecosystems, and for improving 
models that predict future vegetation based on continuous variation in plant form and function. 


Vascular plants are the main entry point for energy and matter into the 
Earth’s terrestrial ecosystems. Their Darwinian struggle for growth, 
survival and reproduction in very different arenas has resulted in an 
extremely wide variety of form and function, both across and within 
habitats. Yet it has long been thought that there is a pattern to be
found in this remarkable evolutionary radiation—that some trait con- 
stellations are viable and successful whereas others are not. 

Empirical support for a strongly limited set of viable trait combinations
has accumulated for traits associated with single plant organs,
such as leaves, stems and seeds. Evidence across plant
organs has been rarer, restricted geographically or taxonomically, and
often contradictory. How tightly whole-plant form and function
are restricted at the global scale remains unresolved. 

Here we present the first global quantitative picture of essential 
functional diversity of extant vascular plants. We quantify the volume, 
shape and boundaries of this functional space via joint consideration 
of six traits that together capture the essence of plant form and function:
adult plant height, stem specific density, leaf size expressed as
leaf area, leaf mass per area, leaf nitrogen content per unit mass, and
diaspore mass. Our dataset, based on a recently updated communal
plant trait database, covers 46,085 vascular plant species from 423
families and to our knowledge spans the widest range of growth-forms 
and geographical locations to date in published trait analyses, includ- 
ing some of the most extreme plant trait values ever measured in the 
field (Table 1, Extended Data Fig. 1). On this basis we reveal that the 
trait space actually occupied is strongly restricted as compared to 
four alternative null hypotheses. We demonstrate that plant species 
largely occupy a plane in the six-dimensional trait space. Two key trait 
dimensions within this plane are the size of whole plants and organs 
on the one hand, and the construction costs for photosynthetic leaf 
area, on the other. We subsequently show which sections of the plane 
are occupied, and how densely, by different growth-forms and major 
taxonomic groups. The design opportunities and limits indicated by 
today’s global spectrum of plant form and function provide a founda- 
tion to achieve a better understanding of the evolutionary trajectory 
of vascular plants and help frame and test hypotheses as to where and
how ecological filtering and evolution might further shape the Earth’s
plant trait space.

Table 1 | Range of variation in functional traits, geographic distribution and climatic conditions

Trait                                         Abbreviation  Range                          No. of species
Adult plant height (m)                        H             0.001* to 90†                  24,720
Stem specific density (mg mm⁻³)               SSD           0.06‡ to 1.39§                 11,356
Leaf area (mm²)                               LA            0.79* to 2.79 × 10⁶||          12,173
Leaf mass per area (g m⁻²)                    LMA           4.9¶ to 1,507#                 10,490
Nitrogen content per unit leaf mass (mg g⁻¹)  Nmass         2.48☆ to 68.98**               8,695
Diaspore mass (mg)                            SM            5.15 × 10⁻⁶†† to 2.05 × 10⁷‡‡  24,779
Diaspore mass (mg) excluding pteridophytes    SM            3.0 × 10⁻⁴§§ to 2.05 × 10⁷‡‡   24,449
Latitude (degrees)                                          55 S to 83.17 N
Altitude (m)                                                −59 to 5,249
Mean annual temperature (°C)                                −27.22 to 29.97
Mean annual sum of precipitation (mm yr⁻¹)                  <5 to 7,693

Latitude and altitude are based on species occurrences in the Global Biodiversity Information Facility database (http://www.gbif.org). Mean annual temperature and annual sum of precipitation refer
to the CRU 0.5-degree climatology. *Wolffia arrhiza and Azolla microphylla; †Sequoia sempervirens and Eucalyptus regnans; ‡Utricularia vulgaris; §Caesalpinia sclerocarpa; ||Victoria amazonica; ¶Myriophyllum
aquaticum; #Agave americana; ☆Hakea erecta; **Dipcadi glaucum; ††Blechnaceae; ‡‡Lodoicea maldivica; §§Laelia undulata and Alectra vogelii.


The trait space occupied by plants worldwide 

Certain traits can be thought of as indexing positions of species along 
key dimensions of plant ecological strategy directly relevant to growth, 
survival and reproduction. We chose six traits whose fundamental
importance for ecological strategy has been established
unequivocally and for which data have recently become available for
an unprecedented number of species worldwide. Among the six key
traits (see Methods for details and references) adult plant height (H) 
corresponds with the ability to pre-empt light resources and disperse 
diaspores. Stem specific density (SSD) reflects a trade-off between 
growth potential and mortality risk from biomechanical or hydraulic 
failure. Leaf area (LA, size of an individual leaf) has important conse- 
quences for leaf energy and water balance. Leaf mass per area (LMA) 
and leaf nitrogen content per unit mass (Nmass) express different aspects 
of leaf strategy for resource capture and conservation: LMA reflects 
a trade-off between carbon gain and longevity, while Nmass reflects 
a trade-off between the benefits of photosynthetic potential and the 
costs of acquiring nitrogen and suffering herbivory. Diaspore mass (the 
mass of an individual dispersed seed or spore; SM) reflects a trade-off
between seedling survival and colonization ability in space and
time. Ranges of trait variation span from 2 (SSD, Nmass) to 13 orders of 
magnitude (SM) (Table 1). 

We investigated which portion of the six-dimensional trait space is 
occupied by vascular plants that now live on Earth. There are two pri- 
mary reasons why plants might occupy a subset of the potential trait 
space: (1) values of independent traits are distributed along each axis 
in a clumped, non-uniform manner; and (2) there are inherent correlations
between the values of different traits. We therefore built four
null models varying the trait distributions and their correlation structure.
We computed the volume of the six-dimensional convex hull,
that is, the smallest convex volume in hyperspace that contains the (log10-
and z-transformed) observed values of H, SSD, LA, LMA, Nmass and
SM (for a visualization see https://sdray.shinyapps.io/globalspectr/;
Supplementary Application 1), and compared it against hypervolumes
from four null hypotheses (hv_nm1 to hv_nm4; shown diagrammatically
in Fig. 1 and described in detail in Methods). Hypervolumes hv_nm1
to hv_nm3 assume that the traits vary independently, resulting in a
functional space spanning along six orthogonal axes. Null model 1
assumes that any combination of trait values can arise and escape
natural selection with equal probability (for example ref. 35), thus
extreme and central values are equally likely, each trait having a
uniform distribution, and hv_nm1 approximating a hypercube. Null
model 2 assumes that extreme trait values are selected against during
evolution and each trait has a log-normal distribution, with hv_nm2
approximating a hypersphere. Null model 3 imposes no assumptions
about trait distributions but instead allows each trait to be distributed
as observed and assumes traits are independent of one another.
Null model 4 assumes that extreme values are selected against (that is,
log-normally distributed) and maintains the observed correlation
structure among traits. Relative to null models 1 to 3, null model 4
collapses the multidimensional trait space occupied by plants (hv_nm4)
into an elongated hyperellipsoid.
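
The log10- and z-transformation applied to trait values before the hull is computed can be sketched in a few lines of Python. This is only a minimal illustration: the function name, the example heights and the choice of the sample standard deviation are assumptions, since the text here does not specify them.

```python
import math
import statistics

def log10_z_transform(values):
    """Return log10-transformed values rescaled to zero mean and unit SD."""
    logged = [math.log10(v) for v in values]  # trait values must be positive
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)  # sample SD (an assumption of this sketch)
    return [(x - mean) / sd for x in logged]

# Hypothetical plant heights in metres, spanning several orders of magnitude.
heights_m = [0.01, 0.1, 1.0, 10.0, 100.0]
z_scores = log10_z_transform(heights_m)
```

After this step every trait is on the same unitless scale, so that a trait varying over 13 orders of magnitude does not dominate one varying over 2 when volumes are compared.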

We found that the observed hypervolume (hv_obs) is much smaller
than hypervolumes expected under the first three null models (hv_nm1
to hv_nm3) (Fig. 1). While closer in size to hv_nm4, it is still 20% smaller. It
also shows greater aggregation of species (‘lumpiness’) in multivariate
space than expected under each of the null models (Supplementary
Table 1). Thus the restriction of the observed hypervolume mainly
reflects correlations among the six traits, and also, to a smaller
degree, a greater concentration than expected under multivariate
normality. In sum, the trait hypervolume occupied by living vascular
plants is strongly constrained, converging towards a relatively small
set of successful trait combinations.

[Figure 1 panels: Null model 1, traits uniformly distributed and independent from each other (hv_nm1); Null model 2, traits normally distributed and independent from each other (hv_nm2); Null model 3, traits distributed as observed and independent from each other (hv_nm3); Null model 4, traits normally distributed and correlated as observed (hv_nm4); Observed spectrum (hv_obs). Arrow labels: −72.43%, −98.05%, −63.91%, −18.05%.]

Figure 1 | The volume in trait space occupied by vascular plant species
is strongly constrained compared to theoretical null models. The five
diagrams are pictorial representations based on three out of the six trait
dimensions forming the hypervolumes under scrutiny. The hypervolumes
are constructed on the basis of log10- and z-transformed observed values
of H, SSD, LA, LMA, Nmass and SM (observed hypervolume = hv_obs), or
on the basis of four different null models of multivariate variation of
those traits (hv_nm1 to hv_nm4) (see Methods). Numbers adjacent to arrows
indicate percentage reductions in size of hv_obs compared to the null-model
hypervolumes (all significant at P < 0.001).
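
The comparison of an observed hull with a null-model hull can be illustrated with a two-dimensional toy example in Python. This is a sketch under stated assumptions, not the authors' six-dimensional computation: the data are simulated, all function names are hypothetical, and the null model only follows the spirit of null model 3 (each trait keeps its own distribution, but the correlation between traits is broken by shuffling).

```python
import random

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def hull_area(points):
    """Area of the convex hull, computed with the shoelace formula."""
    hull = convex_hull(points)
    area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

def percent_change(observed, null):
    """Signed percentage difference of the observed size relative to the null."""
    return 100.0 * (observed - null) / null

rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [xi + 0.1 * rng.gauss(0.0, 1.0) for xi in x]  # strongly correlated with x
observed_area = hull_area(list(zip(x, y)))

# Null: same marginal distributions, correlation destroyed by shuffling y.
y_shuffled = y[:]
rng.shuffle(y_shuffled)
null_area = hull_area(list(zip(x, y_shuffled)))

change = percent_change(observed_area, null_area)  # negative: observed is smaller
```

Because the two simulated traits are tightly correlated, the observed hull is a thin sliver and the change is strongly negative, the same kind of signal that the percentage reductions in Fig. 1 express. Extending this to six dimensions would need a general-purpose hull routine (for instance SciPy's `ConvexHull`, which reports a `volume`); that substitution is a suggestion and not something stated in the text.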


Main trends of variation 

Within the observed worldwide plant trait space we identified the 
main independent dimensions of variation. Seventy-four percent of 
the variation in the six-dimensional space was accounted for by the 
plane defined by the first two principal components (PCs), the only
PCs found to contain significant, non-redundant information (Fig. 2,
Extended Data Table 1 and Extended Data Fig. 2; all PCs displayed at
https://sdray.shinyapps.io/globalspectrPC; Supplementary Application
2). Within this plane two notable dimensions of trait variation stand
out. One dimension runs from short species tending to have small
diaspores to tall species tending to have large diaspores (lower left
to upper right in Fig. 2a, ‘H-SM’; more strongly associated with PC1
than PC2). The other (upper left to lower right in Fig. 2a, ‘LMA-Nmass’;
more strongly associated with PC2 than PC1) runs from species with
cheaply constructed, ‘acquisitive’ leaves (low-LMA, nitrogen-rich) to
species with ‘conservative’ leaves (high-LMA, nitrogen-poor) that are
expected to have longer leaf lifespan and higher survival in the face of
abiotic and biotic hazards. Stem specific density (SSD) and leaf
area (LA) also load heavily on the plane and are correlated with both
the H-SM and the LMA-Nmass dimensions (Fig. 2a, Extended Data
Table 1). Although SSD and SM increase with plant height, at any
given H there is considerable independent variation in both (Extended
Data Fig. 3a, f), and at any given LMA and Nmass there is considerable
independent variation in LA (Extended Data Fig. 3b, c). These
general patterns are robust (Extended Data Table 1) with respect to 
species selection (for example, considering angiosperms rather than 
all species), to re-running analyses on a 45,507-species ‘gap-filled’ trait 
matrix rather than the 2,214-species six-trait matrix, and to exclusion 
of individual traits (for example, using only one rather than both leaf 
economic traits). The outer reaches of the main plane of variation 
represent extreme combinations of plant size and leaf structure and 
function (see circled numbers in Fig. 2a, and Extended Data Table 
2 for illustrative species), with a wide gradient of intermediate trait 
combinations between them, together expressing the rich variety of 
ways in which plants balance the challenges of growth, survival and 
reproduction. 
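
How a statement such as "74% of the variation is accounted for by the first two principal components" is obtained can be sketched as follows. The Python below is an illustration only: it runs a principal component analysis on the correlation matrix (equivalent to PCA on z-transformed traits) and extracts the two leading eigenvalues by power iteration with deflation, a technique chosen here to keep the example dependency-free. The simulated traits, loadings and function names are all hypothetical, and the authors' actual PCA settings are not described in this section.

```python
import math
import random

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long trait columns."""
    n = len(columns[0])
    scaled = []
    for col in columns:
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        scaled.append([(v - mean) / sd for v in col])
    k = len(scaled)
    return [[sum(a * b for a, b in zip(scaled[i], scaled[j])) / (n - 1)
             for j in range(k)] for i in range(k)]

def top_eigenpair(matrix, iterations=1000):
    """Largest eigenvalue and its eigenvector, found by power iteration."""
    k = len(matrix)
    vec = [float(i + 1) for i in range(k)]  # arbitrary non-zero start vector
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    value = 0.0
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)]
        value = math.sqrt(sum(v * v for v in nxt))
        if value < 1e-12:  # matrix is numerically zero; nothing left to extract
            return 0.0, vec
        vec = [v / value for v in nxt]
    return value, vec

def variance_in_first_two_pcs(columns):
    """Fraction of total variance captured jointly by PC1 and PC2."""
    corr = correlation_matrix(columns)
    k = len(corr)
    total = sum(corr[i][i] for i in range(k))  # trace equals the number of traits
    l1, v1 = top_eigenpair(corr)
    # Deflation: subtract PC1 so that power iteration next converges to PC2.
    deflated = [[corr[i][j] - l1 * v1[i] * v1[j] for j in range(k)]
                for i in range(k)]
    l2, _ = top_eigenpair(deflated)
    return (l1 + l2) / total

# Hypothetical data: six traits driven by two latent axes plus noise.
rng = random.Random(0)
size = [rng.gauss(0, 1) for _ in range(300)]
econ = [rng.gauss(0, 1) for _ in range(300)]
loadings = [(1.0, 0.0), (0.9, 0.2), (0.7, 0.4), (0.6, -0.3), (0.1, 1.0), (0.0, -0.9)]
traits = [[a * s + b * e + 0.3 * rng.gauss(0, 1) for s, e in zip(size, econ)]
          for a, b in loadings]
fraction = variance_in_first_two_pcs(traits)
```

When six traits are largely driven by two underlying axes, as in this simulation, most of the total variance falls on the first two components, which is the pattern reported above for the real trait data.
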

Figure 2 | The global spectrum of plant form and function. a, Projection
of global vascular plant species (dots) on the plane defined by principal
component axes (PC) 1 and 2 (details in Extended Data Table 1 and
Extended Data Fig. 2). Solid arrows indicate direction and weighting of
vectors representing the six traits considered; icons illustrate low and
high extremes of each trait vector. Circled numbers indicate approximate
position of extreme poles of whole-plant specialization, illustrated by
typical species (Extended Data Table 2). The colour gradient indicates
regions of highest (red) to lowest (white) occurrence probability of species
in the trait space defined by PC1 and PC2, with contour lines indicating
0.5, 0.95 and 0.99 quantiles (see Methods, kernel density estimation).
Red regions falling within the limits of the 0.50 occurrence probability
correspond to the functional hotspots referred to in main text.
b, c, Location of different growth-forms (b) and major taxa (c) in the
global spectrum.


Major taxa, growth-forms, and functional hotspots

Different plant groups distribute unevenly in the global spectrum of 
form and function. Both herbaceous and woody growth-forms show 
considerable variation along the two main dimensions (Fig. 2b). The
two groups are offset along the H-SM dimension (Fig. 2b), with woody 
species, on average, being taller and having larger seeds and leaves; 
woody species also tend to have higher SSD and LMA than herbaceous
species (Extended Data Fig. 3a–e). Also, although taller species
have larger seeds in both herbaceous and woody species-groups, the 
relationship is only very weak in herbaceous species (Extended Data 
Figs 3f and 4). In sum, the distinction in traits between herbaceous and 
woody growth-forms goes beyond the obvious difference in height and 
stem structure that has been recognized since antiquity. At the same
time, there exist commonalities in trait coordination and trade-offs 
across both herbaceous and woody plants, shown here at a global scale 
for the first time. For example, herbaceous and woody plants overlap 
widely along the LMA-Nmass dimension (Fig. 2b), particularly in regard
to Nmass (Extended Data Fig. 3c), and LMA and Nmass are largely independent
from LA in both groups (Extended Data Fig. 3b, c). Further,
while neither SSD nor LMA increases with plant stature within either 
group (Extended Data Figs 3a, e and 4), LA increases with H in both 
(Extended Data Fig. 3d). These multivariate trends are summarised by 
the clear distinction of herbaceous and woody species-groups along 
PC1, and their broad overlap along PC2 (Extended Data Fig. 2a).

There are also strong differences in trait-space occupancy by major 
taxa. For gymnosperms, high costs of seed packaging and abortion 
are thought to set a lower bound on seed size. Accordingly, in
Fig. 2c gymnosperms are confined to the right-hand side (see also
Extended Data Fig. 2b, and, for examples, Extended Data Table 2). 
The emergence of angiosperms allowed a considerable extension into 
smaller seed size that is manifest in extant species. This also opened
up lifestyles involving colonization of open ground, shorter lifespans 
and herbaceous growth-form (towards the left of Fig. 2a). The sec- 
ond major angiosperm innovation whose footprint is evident in the 
trait space concerns xylem vessels. Angiosperm vessels are longer 
and larger-diameter conduits than gymnosperm and pteridophyte 
tracheids, permitting much higher hydraulic conductivities. This, 
together with a greater density of leaf veins, has allowed angiosperms 
to deliver a faster transpiration stream while requiring less volume 
within the leaf. These anatomical innovations have made it possible
for angiosperms to extend the range of leaf stomatal conductances 
and photosynthetic capacities to higher values (requiring coordinated 
higher Nmass) and the range of LMA to lower values compared to gymnosperms
and pteridophytes (Fig. 2c). Higher hydraulic conductivity
presumably also enabled the evolution of very large leaves in angio- 
sperms, and a far wider variety in leaf morphology too. Nevertheless, 
while angiosperm innovations have expanded trait space considerably 
towards higher leaf Nmass and LA and (compared with gymnosperms)
lower diaspore mass, angiosperms also converged on the same zones 
of trait space as gymnosperms and pteridophytes, as seen in the lower 
right and lower left of the global trait plane (Fig. 2c). 

There are two clear functional hotspots—areas of particularly 
dense species occupation—in trait space (Fig. 2a). The bimodality 
resides in H and in SSD, rather than in LMA, Nmass or LA (Extended
Data Fig. 4). The first hotspot almost entirely corresponds to herba- 
ceous plants and lies at the core of the distribution of both graminoid 
(grass-like) and non-graminoid herbs, having small, acquisitive leaves 
and small seeds. The second hotspot lies within the trait space occu- 
pied by woody plants, positioned towards the upper right corner of 
Fig. 2a. It consists mostly of tree species of moderate to great height 
with large leaves and large seeds; plants other than angiosperms are 
almost completely absent from it. Many phylogenetically distant fam- 
ilies and orders of angiosperms are represented within each hotspot 
(Supplementary Table 2), indicating that these prevalent ecological 
trait constellations represent successful solutions acquired repeatedly 
through the evolutionary history of vascular plants. 

Discussion

Our findings show that the trait space currently occupied by vascular 
plants is quite restricted compared to the range of possibilities that 
would exist if traits varied independently. Importantly, this finding 
arises from the combined analysis of six traits describing different 
plant organs, and from a wider spread of taxa and life histories than 
has previously been possible. It yields the most comprehensive pic- 
ture to date of how the remarkable functional diversity of vascular 
plants seen on Earth today has been able to evolve within very general 
constraints. This worldwide functional six-trait space is wide, diverse 
and lumpy, with its fringes occupied by species (indicated with circled 
numbers in Fig. 2a) ranging from the short model plant thale cress 
(Arabidopsis thaliana) to the 60-m-tall Brazil nut tree (Bertholletia
excelsa), from flimsy watermilfoil (Myriophyllum spicatum) to tough
monkey puzzle tree (Araucaria araucana), from the tender but toxic 
devil's snare (Datura stramonium) to the hardy needlewood (Hakea 
leucoptera), from the minute leaves and seeds of heather (Calluna 
vulgaris) to the large leaves and seeds of lotus (Nelumbo nucifera) 
(description and additional illustrative species in Extended Data 
Table 2). Yet, this variation of the six key traits is largely concentrated 
into a plane. 

Stem density, leaf size and diaspore size represent trade-offs within 
distinct aspects of plant biology, and in previous studies of trends
across different plant organs, these traits have shown considerable 
variation that is independent from whole-plant size and leaf carbon 
economy. However, those analyses were based on far more restricted 
datasets than considered here, in terms of growth-forms, habitats, or 
both, considering for example tropical woody species, temperate
semiarid pine forests, or countrywide herbaceous floras.
At the global scale of our study, these three traits do not constitute 
major independent dimensions; rather, substantial variation in them 
is captured by the plane that summarizes global variation in vascu- 
lar plant design (Fig. 2). Our results are correlative and cannot prove 
rigorously why such a large share of the potential trait volume is not 
occupied. Still, from first principles many more combinations of traits 
than those observed seem feasible as far as biomechanics and evolu- 
tionary genetics are concerned. We suggest the concentration into two 
dimensions and the lumpiness within that plane reflect the major trait 
constellations that are competent and competitive across the ecological 
situations available on Earth today. 

The patterns described here pertain to fundamental aspects of form 
and function critical to growth, survival and reproduction of the vast 
majority of vascular plants on Earth. Importantly, plants converge and 
diverge in many more ways than explored here, through variation in 
a vast array of traits beyond the scope of our analysis, related to the 
fine-tuning of different taxa to specific abiotic and biotic conditions 
in their habitat (for example refs 42-44). Such variation fits within the 
more general patterns shown here. 

More broadly, our findings are directly relevant to a number of 
long-running and emerging broad-scale scientific initiatives. First, 
our findings provide the widest empirical context so far for examining 
theories that have focused on plant ecological strategies—on differ- 
ent aspects of the Darwinian struggle for existence. For example, the 
H-SM dimension could be seen as reflecting the r (colonization) versus
K (exploitation) continuum. The LMA-Nmass dimension reflects
the A (adversity-selection) continuum, acquisitive-conservative
continuum or leaf economic spectrum. The positions signalled
by numbers 3, 4, and 5 in Fig. 2a (and described in Extended Data
Table 2) could arguably roughly correspond to the stress-tolerant,
ruderal, and competitor strategies of Grime. Interestingly, the
functional hotspots lie at intermediate positions on the plane rather 
than at any of these extreme positions (that is, r versus K, acquisitive 
versus conservative resource economy, C, S or R-strategy). 

Second, the global spectrum we describe has potential to improve 
emerging large-scale vegetation and ecosystem models (for example,
see refs 47-49) because we clearly show (Fig. 2a and Extended
Data Fig. 4) that only a limited set of combinations are observed
from six plant traits most fundamental to survival, growth and 
reproduction. 

More generally, our findings—as encapsulated in the plane of 
Fig. 2—establish a backdrop against which many other facets of plant 
biology can be placed into a broader context. Plant lineages, evolution- 
ary trajectories, and historical and contemporary plant communities 
and biomes can be mapped onto this global trait spectrum. Trait vari- 
ation in any given physical setting can be compared to the worldwide 
background. Model species such as Arabidopsis thaliana (located at 
one extreme of the spectrum) can be positioned against this backdrop, 
helping to judge how typical or otherwise their physiology and natural 
history might be. The global spectrum of plant form and function is 
thus, in a sense, a galactic plane within which we can position any 
plant—from star anise to sunflower—based on its traits. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 

METHODS


Plant trait definitions and ecological meaning. Adult plant height (typical height 
of the upper boundary of the main photosynthetic tissues at maturity; hereafter H) 
is the most common measure of whole plant size and indicates ability to pre-empt 
resources, including the ability of taller plants to display their leaves above those of 
others and therefore outcompete them; it also relates to whole plant fecundity and 
facilitation of seed dispersal. Taller plants intercept more light but, trading
off against that, construction and maintenance costs and risk of breakage increase
with height. Large stature has been repeatedly selected for in different lineages
during the evolution of land plants, although achieved very differently in different
clades.

Stem specific density (dry mass per unit of fresh stem volume; SSD) is a sec- 
ond key index of construction costs and structural strength. Although SSD is 
more commonly measured on trees, here we used data for both herbaceous and 
woody species. At least among woody species, stem specific density is positively 
linked with plant mechanical strength, hydraulic safety and resistance to biotic 
agents. In high-precipitation systems wood density underpins a successional
continuum running from low-SSD, fast-growing, light-demanding species
to high-SSD, slow-growing, shade-tolerant species. More broadly SSD characterizes
a trade-off between fast growth with high mortality rates and slow growth with
high survival.

Leaf area (one-sided surface area of an individual lamina; LA) is the most com- 
mon measure of leaf size. It is relevant for light interception and has important 
consequences for leaf energy and water balance. LA affects leaf temperature
via boundary layer effects. The larger the lamina, the lower the heat exchange and the
diffusion of carbon dioxide and water vapour per unit leaf area between a leaf and
the surrounding air. LA is known to be constrained by climatic and microclimatic
conditions and also by the allometric consequences of plant size, anatomy and
architecture.

Leaf dry mass per unit of lamina surface area (leaf mass per area; LMA) and 
leaf nitrogen content per unit of lamina dry mass (Nmass) reflect different aspects 
of leaf-level carbon-gain strategies, in particular the “leaf economic spectrum”
running from “conservative” species with physically robust, high-LMA leaves with
high construction costs per unit leaf area and long expected leaf lifespan (and thus
duration of photosynthetic income) to “acquisitive” species with tender, low-cost
low-LMA leaves, and short leaf lifespan. LMA relates the area of
light interception to leaf biomass, being an expression of how much carbon is
invested per unit of light-intercepting area, and thus reflecting a trade-off between
construction cost and longevity of lamina. Nmass is directly related to photosynthesis
and respiration, as the majority of leaf nitrogen is found in metabolically
active proteins. Nmass reflects a trade-off between, on the one hand, two different
costs that increase with higher Nmass (to acquire N, and potentially suffer more
herbivory) and, on the other hand, the greater photosynthetic potential that higher
Nmass allows.

Diaspore mass (mass of an individual seed or spore plus any additional struc- 
tures that assist dispersal and do not easily detach; SM) indexes species along a 
dimension describing the trade-off between seedling competitiveness and survival 
on the one hand, and dispersal and colonization ability on the other. As a
broad generalization small seeds can be produced in larger numbers with the same 
reproductive effort and, at a given plant height, be dispersed further away from 
the parent plant and form persistent seed banks, whereas large seeds facilitate 
survival through the early stages of recruitment, and higher establishment in the 
face of environmental hazards (for example deep shade, drought, herbivory).
Dataset description. We compiled a global dataset containing 46,085 species 
and 601,973 cells, of which 92,212 correspond to quantitative species-level trait 
information, based on > 800,000 trait measurements for the six traits of interest 
on > 500,000 plant individuals. The vast majority of data were compiled from 
pre-existing smaller datasets contributed to the TRY Plant Trait Database® (https:// 
www.try-db.org, accessed May 2015). The dataset was supplemented by published 
data not included in TRY and a small number of original unpublished data by W. 
J. Bond, J.H.C.C., S.Di., L. Enrico, M. T. Fernandez-Piedade, L.D.G., D.K., M.K.,
N. Salinas, E.-D. Schulze, K. Thompson, and R. Urrutia. The final dataset (BLOB) 
was derived from 175 studies! !1:13:20.21,23-25,56,57,64,65,73-235 

In this global analysis, each species, identified subspecies or local variety is 
represented by a single value for each trait. This value is the geometric mean of 
all the observations of a trait present in the TRY Plant Trait Database and additional
databases incorporated into the present dataset. The number of observations
per trait and species ranges from a single one (in the case of rare, geographically
restricted species) to hundreds (in the case of common species with wide geo- 
graphical distribution). In this way, the analysis incorporates the high intraspecific 
variation that is sometimes observed in widespread species*”!®>3638. In addition 
and fully acknowledging their existence, intraspecific variations are assumed to
be comparatively small in the context of the vast range of variation contained in
this worldwide dataset®”. 
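
The aggregation rule described above (one geometric mean per species and trait) can be illustrated with a minimal sketch. The record layout, the function name `species_trait_means` and the example values are assumptions made for illustration; they are not taken from the TRY database or from the authors' scripts.

```python
from collections import defaultdict
from statistics import geometric_mean


def species_trait_means(records):
    """Collapse individual observations to one geometric mean per (species, trait).

    `records` is an iterable of (species, trait, value) tuples; this layout is
    a hypothetical one chosen for the example.
    """
    grouped = defaultdict(list)
    for species, trait, value in records:
        grouped[(species, trait)].append(value)
    # A species with a single observation simply keeps that value.
    return {key: geometric_mean(values) for key, values in grouped.items()}


# Made-up observations, for illustration only.
example_records = [
    ("Species A", "H", 1.0),
    ("Species A", "H", 100.0),
    ("Species B", "H", 4.0),
]
example_means = species_trait_means(example_records)
```

The geometric mean is equivalent to averaging on the log scale, which fits an analysis that log10-transforms the traits afterwards.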

Species names were standardized and attributed to families according to The 
Plant List (http://www.theplantlist.org/; accessed 2015). Attribution of families to 
higher-rank groups was made according to APG III (2009) (http://www.mobot. 
org/MOBOT/research/APweb/). Information about primary growth-form (woodiness:
woody, semi-woody, non-woody) and secondary growth-form (herbaceous
non-graminoid, herbaceous graminoid, herbaceous non-graminoid/shrub, shrub, 
shrub/tree, tree, climber, succulent, other) was added based on a look-up table 
of categorical plant traits (ref. 30; https://www.try-db.org/TryWeb/Data.php#3) and
additional information from various sources; >86% of species were allocated to categories
according to primary growth-form, and >80% according to secondary
growth-form. 

Species distribution data were derived from the Global Biodiversity Information
Facility (GBIF; http://www.gbif.org) and combined with 0.5 × 0.5 degree gridded
long-term climate information derived from CRU (http://www.cru.uea.ac.uk/data).
Trait measurement. In the case of published datasets, trait measurement methods 
are in the original publications listed in Dataset description. In the case of unpub- 
lished records, they were measured following the protocols specified in the context 
of the LEDA project (https://www.leda-traitbase.org) or in ref. 239 (http://www. 
nucleodiversus.org/index.php?mod=page&id=79). All data were unit-standard- 
ized and subjected to error detection and quality control (see below). 

Treatment of pteridophyte spore mass. The trait values for diaspore mass of pteridophytes
were estimated based on literature data for spore radius (r). We made crude
assumptions that spores would be broadly spherical, with volume = (4/3)πr³, and
that their density would be 0.5 mg mm⁻³. Although these assumptions were clearly
imprecise, we are confident they result in spore masses within the right order of 
magnitude (and several orders of magnitude smaller than seed mass of spermato- 
phytes). Most data were from ref. 240, data for Sadleria pallida were from ref. 241, 
for Pteridium aquilinum from ref. 242, and for Diphasiastrum spp. from ref. 243. 
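
As a worked illustration of the spore-mass estimate, the sketch below multiplies the volume of a sphere by the assumed density of 0.5 mg per cubic millimetre. The function name and the example radius are hypothetical; the radius is not a value from the cited literature.

```python
import math


def spore_mass_mg(radius_mm, density_mg_per_mm3=0.5):
    """Mass of a spherical spore: density * (4/3) * pi * r**3."""
    if radius_mm <= 0:
        raise ValueError("radius must be positive")
    volume_mm3 = (4.0 / 3.0) * math.pi * radius_mm ** 3
    return density_mg_per_mm3 * volume_mm3


# Hypothetical spore with a radius of 0.02 mm (20 micrometres).
example_mass = spore_mass_mg(0.02)
```

With these inputs the mass is of the order of 10⁻⁵ mg, which shows how strongly the estimate depends on the cube of the radius.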
Treatment of stem specific density in herbaceous species. Data on stem specific 
density (SSD) are available for a very large number of woody species, but only 
for very few herbaceous species. To incorporate this fundamental trait in our 
analysis, we complemented SSD of herbaceous species using an estimation based 
on leaf dry matter content (LDMC), a much more widely available trait, and its 
close correlation to stem dry matter content (StDMC, the ratio of stem dry mass 
to stem water-saturated fresh mass). StDMC is a good proxy of SSD in herba- 
ceous plants with a ratio of approximately 1:1 (ref. 100), despite substantial dif- 
ferences in stem anatomy among botanical families™‘, including those between 
non-monocotyledons and monocotyledons (where sheaths were measured). We 
used a data set of 422 herbaceous species collected in the field across Europe 
and Israel, and belonging to 31 botanical families‘ to parameterize linear rela- 
tionships of StDMC to LDMC. The slopes of the relationship were significantly 
higher for monocotyledons than for other angiosperms (F = 12.3; P< 0.001); 
within non-monocotyledons, the slope for Leguminosae was higher than that 
for species from other families. We thus used three different equations to predict 
SSD for 1963 herbaceous species for which LDMC values were available in TRY: 
one for monocotyledons (SSD = 0.888 × LDMC + 2.69), one for Leguminosae
(SSD = 0.692 × LDMC + 47.65), and a third one for other non-monocotyledons
(SSD = 0.524 × LDMC + 95.87).
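
The three equations can be applied as a simple lookup, sketched below. The coefficients are those printed above; the group labels, the function name and the example LDMC value are assumptions for illustration, and the units are whatever units the original equations were fitted in (they are not restated in this section).

```python
# (slope, intercept) per group, copied from the equations in the text.
SSD_COEFFICIENTS = {
    "monocotyledon": (0.888, 2.69),
    "leguminosae": (0.692, 47.65),
    "other_non_monocotyledon": (0.524, 95.87),
}


def predict_ssd(ldmc, group):
    """Predict SSD of a herbaceous species from LDMC using the group-specific line."""
    slope, intercept = SSD_COEFFICIENTS[group]
    return slope * ldmc + intercept


# Hypothetical LDMC value of 200, in the same units as the fitted equations.
example_ssd = {group: predict_ssd(200.0, group) for group in SSD_COEFFICIENTS}
```

Keeping the coefficients in one table makes it easy to check them against the text and to see that the group assignment, not only LDMC, drives the prediction.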

Error detection and data quality control. The curation of the dataset faced a double 
challenge: (1) detecting erroneous entries (due to errors in sampling, measurement, 
unit conversion, etc.); and (2) ensuring that extreme values that correspond to 
truly extreme values of traits in nature are not mistakenly identified as outliers 
and therefore excluded from the dataset. To deal with these challenges, we took 
the following approach: Trait records measured on juvenile plants and on plants 
grown under non-natural environmental conditions were excluded from the data- 
set. Duplicate trait records (same species, similar trait values, no information on 
different measurement locations or dates) and obvious errors (for example LMA 
<0) were excluded from the dataset. We then identified potential outliers follow- 
ing the approach described in ref. 30. Trait records with a distance of >4 stand- 
ard deviations from the mean of species, genus, family or higher-rank taxonomic 
group were excluded from the dataset unless their retention could be justified 
from external sources. Trait records with a distance of >3 standard deviations 
from the mean of species, genus, family or phylogenetic group were identified, 
checked by domain experts for plausibility and retained or excluded accordingly. 
The remaining dataset was used to calculate species mean trait values. Finally, the 
ten most extreme species mean values of each trait (smallest and largest) were again 
checked for reliability against external sources. 
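
The two distance thresholds can be sketched as a small classifier. This is only an illustration of the rule as worded above, not the procedure of ref. 30: the function name is hypothetical, and the group mean and standard deviation are assumed to have been computed beforehand for the relevant taxonomic level.

```python
def classify_record(value, group_mean, group_sd):
    """Return 'exclude', 'check' or 'keep' for one trait record."""
    if group_sd <= 0:
        raise ValueError("group_sd must be positive")
    distance = abs(value - group_mean) / group_sd
    if distance > 4:
        # Excluded unless retention can be justified from external sources.
        return "exclude"
    if distance > 3:
        # Passed to domain experts for a plausibility check.
        return "check"
    return "keep"


# Made-up values against a group with mean 0 and standard deviation 1.
example_labels = [classify_record(v, 0.0, 1.0) for v in (0.0, 3.5, 4.5, -5.0)]
```

In the workflow described above, both the "exclude" and "check" outcomes still involve a manual step, so the function only flags records; it does not decide their fate.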

Construction of observed and simulated six-trait convex hull hypervolumes. 
In order to explore the constraints underlying the trait space occupied by species 
in our dataset, we used the convex hull approach of ref. 34, which has been applied 
successfully to a wide range of datasets, including disjoint ones. The application
of a recently developed, and therefore less widely tested, method proposed for
“holey” datasets yielded similar results.

We computed a six-dimensional convex hull volume (i.e. a six-dimensional
measure of the minimum convex volume of trait space occupied by species in our
dataset, hereafter hvobs) on the basis of the observed values of H, SSD, LA, LMA,
Nmass and SM, and compared it to four null model volumes (hvnm1-4) constructed
under four different sets of assumptions. In all four cases the null hypothesis was
H0: hvobs = hvnm and the alternative hypothesis was H1: hvobs < hvnm (‘the volume
of the convex hull defined by the observed species is smaller than the volume
occupied by species if their traits were generated under the null hypothesis’).
Observed data were log10-transformed and standardized to zero mean and unit
variance (z-transformation). Percentages in Fig. 1 indicate the reduction in size of
the observed hypervolume compared to the average of 999 hypervolumes generated
from the assumptions of each null model (Monte-Carlo permutations). To
control for outliers, computation of volumes was performed on the observed and
simulated convex hulls containing the 95% of species located closest to the centroid. A
visualization of the observed dataset and the four null models in three-dimensional 
trait spaces is available at https://sdray.shinyapps.io/globalspectr/ (Supplementary 
Application 1). The R script used for hypervolume computation is provided at ftp:// 
pbil.univ-lyon1.fr/pub/datasets/dray/Diaz_Nature/. 
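
The pre-processing and the percentage comparison described above can be sketched as follows. This is not the authors' R script: all function names are hypothetical, the sample standard deviation is an assumed choice for the z-transformation, and the convex hull volumes are assumed to come from a separate routine that is not shown.

```python
import math
from statistics import mean, stdev


def z_transform_log10(values):
    """log10-transform one trait, then scale to zero mean and unit variance."""
    logs = [math.log10(v) for v in values]
    centre, spread = mean(logs), stdev(logs)
    return [(x - centre) / spread for x in logs]


def keep_closest_to_centroid(points, fraction=0.95):
    """Keep the given fraction of points lying closest to the centroid."""
    n_dim = len(points[0])
    centroid = [mean(p[d] for p in points) for d in range(n_dim)]
    ranked = sorted(points, key=lambda p: math.dist(p, centroid))
    return ranked[: max(1, round(fraction * len(points)))]


def percent_reduction(observed_volume, null_volumes):
    """Reduction of the observed volume relative to the mean null-model volume."""
    return 100.0 * (1.0 - observed_volume / mean(null_volumes))


# Made-up inputs, for illustration only.
example_z = z_transform_log10([1.0, 10.0, 100.0])
example_points = [(float(i), 0.0) for i in range(19)] + [(100.0, 0.0)]
example_kept = keep_closest_to_centroid(example_points)
example_reduction = percent_reduction(50.0, [100.0, 100.0])
```

Trimming is applied in the same way to observed and simulated data, so the comparison of volumes is not driven by a handful of extreme species on either side.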

Null model 1. Species traits vary independently and each of them comes from a uni- 
form distribution. This null model assumes that each of the six traits represents an 
independent axis of specialization (i.e. the traits define a six-dimensional manifold) 
and that the occurrence of extreme and central values is equally probable. This uniform
independent trait distribution represents a “Darwinian Demon” scenario,
where any combination of trait values can arise from mutation and escape from 
the natural selection process with equal probability. Simulated data are generated 
by randomly and independently sampling from independent uniform distribu- 
tions whose range limits are constrained to the 0.025 and 0.975 quantiles of the 
observed trait values. The shape of the hypervolume under this null model (hvnm1)
is a hypercube. 
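
A minimal sketch of how data could be simulated under null model 1 is given below. The function names, the linear-interpolation quantile and the seeded random generator are assumptions for illustration; the authors' R implementation may differ in these details.

```python
import random


def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def simulate_null_model_1(observed, n_species, rng):
    """Draw each trait independently from a uniform distribution bounded by
    the 0.025 and 0.975 quantiles of the observed values of that trait."""
    bounds = []
    for column in zip(*observed):
        ordered = sorted(column)
        bounds.append((quantile(ordered, 0.025), quantile(ordered, 0.975)))
    return [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_species)]


# Made-up two-trait dataset with 101 species, for illustration only.
example_observed = [[float(i), float(2 * i)] for i in range(101)]
example_simulated = simulate_null_model_1(example_observed, 50, random.Random(0))
```

Because every trait is sampled within fixed bounds and independently of the others, the simulated points fill a box, which is why the text describes the resulting hypervolume as a hypercube.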

Null model 2. Species traits vary independently and each of them comes from a nor- 
mal distribution. This null model assumes that all six traits evolve independently, 
as in null model 1. However, extreme trait values are selected against during evolu- 
tion. Simulated data were obtained by randomly and independently selecting from 
univariate normal distributions with standard deviation determined by the transformed
observed data. The corresponding hypervolume (hvnm2) is a hypersphere.
Null model 3. Species traits vary independently but, unlike in the previous
models, there is no assumption about the distribution of trait variation; each
trait varies according to the observed univariate distributions. Simulated data
were obtained by permuting the values for each trait independently and therefore
destroying the covariance amongst traits. Under this null hypothesis (hvnm3) the
hypervolume can take many potential shapes, emerging from the possible combi- 
nations of independently sampled plant trait observations. 
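
The permutation step of null model 3 can be sketched in a few lines. The function name and the seeded random generator are assumptions for illustration; the sketch only shows the idea of shuffling each trait column on its own.

```python
import random


def simulate_null_model_3(observed, rng):
    """Permute each trait independently across species.

    Every trait keeps exactly its observed univariate distribution, while the
    covariance among traits is destroyed. The input rows are left untouched.
    """
    columns = [list(column) for column in zip(*observed)]
    for column in columns:
        rng.shuffle(column)
    return [list(row) for row in zip(*columns)]


# Made-up, perfectly correlated two-trait dataset, for illustration only.
example_observed = [[i, 10 * i] for i in range(50)]
example_permuted = simulate_null_model_3(example_observed, random.Random(1))
```

Sorting each simulated column recovers the observed values exactly, which is what distinguishes this model from null models 1 and 2, where a distribution is imposed.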

Null model 4. Species traits are normally distributed and follow the estimated cor- 
relation structure of the observed dataset. This null model assumes that there are 
fewer than six independent axes of specialization and that extreme values are selected
against. Simulated data were obtained by generating multivariate normal variables
with standard deviations of the transformed observed data using the correlation
structure of the observed dataset. The corresponding hypervolume (hvnm4) is a
hyperellipsoid. Deviations of observed data from null model 4 can be explained 
by deviations of the transformed observed univariate distributions from normal 
distributions, either showing lower tails than those expected in a normal distri- 
bution or by the non-observation of some combinations of extreme trait values, 
leading to truncated distributions, or by bimodal distributions. 

Test for concentration of species within the observed convex hull. For each trait, values 
were partitioned into 10 bins so that the multivariate space was divided into 10⁶ cells.
The number of species per cell was computed and cumulative frequency curves 
were built for observed data and null models. For each null model, we simulated 
999 datasets and computed the 0.025-0.975 interquantile range and the median. 
We then determined N10 and N50, the minimum number of cells needed to cover 
10% or 50% of species. 
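
The cell-counting step can be sketched as below. Equal-width bins between the minimum and maximum of each trait are an assumption (the text does not say how the bins were defined), and the function names and example points are hypothetical.

```python
from collections import Counter


def bin_index(value, low, high, n_bins=10):
    """Index of the equal-width bin containing `value` (top edge goes in the last bin)."""
    if high == low:
        return 0
    return min(int((value - low) / (high - low) * n_bins), n_bins - 1)


def cells_to_cover(points, fraction, n_bins=10):
    """Minimum number of cells needed to cover the given fraction of species."""
    n_dim = len(points[0])
    lows = [min(p[d] for p in points) for d in range(n_dim)]
    highs = [max(p[d] for p in points) for d in range(n_dim)]
    counts = Counter(
        tuple(bin_index(p[d], lows[d], highs[d], n_bins) for d in range(n_dim))
        for p in points
    )
    target = fraction * len(points)
    covered = 0
    # Fill the most densely occupied cells first.
    for n_cells, count in enumerate(sorted(counts.values(), reverse=True), start=1):
        covered += count
        if covered >= target:
            return n_cells
    return len(counts)


# Made-up two-trait example: seven species in one cell, three in another.
example_points = [(0.05, 0.05)] * 6 + [(0.95, 0.95)] * 2 + [(0.0, 0.0), (1.0, 1.0)]
example_n10 = cells_to_cover(example_points, 0.10)
example_n50 = cells_to_cover(example_points, 0.50)
example_n80 = cells_to_cover(example_points, 0.80)
```

Small values of N10 and N50 relative to a null model indicate that species are concentrated in a few cells rather than spread evenly through the hull.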

Principal component analysis (PCA). We performed PCAs on different versions 
of the observed dataset and a gap-filled version using the statistical software package
InfoStat (ref. 247) and the R function ‘princomp’. Again, all analyses were carried out
on the correlation matrix of log10-transformed variables (traits). This is equivalent
to using standardized data (z-transformation) and is considered appropriate for
data with different measurement scales. The number of significant PCA axes
to be retained in order to minimize both redundancy and loss of information was 
determined using the procedure proposed by ref. 249, which allows one to test the 
significance of dimensionality in PCA. A visualization of the space occupied by 
vascular plants in the space defined by all six PCA axes (three at a time) is available
at https://sdray.shinyapps.io/globalspectrPC (Supplementary Application 2).
Differences in the position of different major taxa and growth forms along PC1 
and PC2 were tested using analysis of variance (Extended Data Fig. 2). Because of 
the large number of data, we used an alpha level of 0.01 to reject the null hypoth- 
esis. ANOVA was carried out using a linear mixed model to take into account 
the lack of homoscedasticity due to different group sizes. We used AIC and BIC 
criteria to select the best model considering heterogeneous variances. When the 
ANOVA null hypothesis was rejected, means were compared using Fisher's least 
significant difference (P = 0.01). Data were analysed using the lme function of
the nlme and lme4 R packages, interfaced by InfoStat Statistical Software
version 2015 (ref. 247).
Test for robustness and representativeness of multivariate analysis results. In order to 
test if results shown in Fig. 2 and Extended Data Table 1 were robust with respect 
to the selection of traits and species and representative for vascular plants, we 
conducted a number of analyses: exclusion of gymnosperms and pteridophytes 
(‘angiosperms only’), exclusion of individual traits, and comparison to a gap-filled 
dataset representing about 15% of extant vascular plant species worldwide. The 
trait exclusion tests excluded the following individual traits, one at a time: LMA, 
Nmass and SSD, because analyses indicated that LMA and Nmass, although providing
information on different aspects of leaf function (see Methods), are both part of
the leaf economic spectrum, and SSD and plant height both reflect plant size
when woody and herbaceous plants are considered together. To test if the results of 
the multivariate analysis presented in Fig. 2 were representative of vascular plants, 
we constructed a gap-filled dataset based on those species that entered the global 
dataset via the TRY Plant Trait Database. We extracted 328,057 individual plant- 
level trait observations, which provide a substantial number of additional data not 
used in the main analysis. We applied the data selection process as described above 
(section: Error detection and data quality control). The resulting dataset contained 
78% missing entries (gaps), which were filled by Bayesian hierarchical probabilistic
matrix factorization (BHPMF). The gap-filled dataset was then used
to calculate species mean trait values, resulting in a gap-filled dataset for 45,507 
species. To quantitatively compare the results of the PCA presented in Fig. 2 and 
Extended Data Table 1 with those of the angiosperms-only and the gap-filled datasets,
we applied a Procrustes test using the ‘procrustes’ and ‘protest’ functions
in the R package ‘vegan’. The function ‘protest’ tests the non-randomness of the match
between two configurations. Significant results (for example significance < 0.05) indicate that
the two configurations are more similar to each other than expected by chance.
Kernel density estimation. To estimate the occurrence probability of given com- 
binations of trait values in a two-dimensional space defined by PC axes 1 and 2 
(Fig. 2), and bivariate trait combinations (Extended Data Fig. 4), we used two-dimensional
kernel density estimation. Because results depend on the choice of
the bandwidth used for the smoothing kernel, we used unconstrained bandwidth
selectors. To visualize the occurrence probability of a given trait combination
in the PCA space as well as for all possible bivariate trait combinations, we con- 
structed contour plots from two-dimensional kernel density distributions. The 
colour gradient and contour lines in Fig. 2 and Extended Data Fig. 4 correspond 
to the 0.5, 0.95 and 0.99 quantiles of the respective probability distribution, thus 
highlighting the regions of highest and lowest trait occurrence probability. For 
kernel density estimation we used the ‘kde’ function, and for optimal bandwidth
selection, carried out for each trait combination separately, we used the SAMSE
pilot bandwidth selector, both implemented in the R package ‘ks’. The R
script used is provided at ftp://pbil.univ-lyon1.fr/pub/datasets/dray/Diaz_Nature/. 
Code availability. The R scripts used to generate the hypervolumes (Fig. 2) and 
kernel density analyses associated with Fig. 2 and Extended Data Fig. 4 are available
at ftp://pbil.univ-lyon1.fr/pub/datasets/dray/Diaz_Nature/. 
Extended Data Figure 1 | Climatic and geographical coverage of the
dataset. a-d, Green points, occurrence according to GBIF (http://www.gbif.org)
of species with information on all six traits (a, c) and at least one
trait (b, d). Upper panels (a, b) show distribution in major climatic regions
of the world; grey, MAP and MAT as in CRU 0.5 degree climatology;
biome classification according to ref. 262. Lower panels (c, d) show
distribution in the global map (Robinson projection); grey, land surface.
Maps based on the R package ‘maps’, accessed at The Comprehensive R
Archive Network (https://cran.r-project.org/web/packages/maps/index.html).


© 2016 Macmillan Publishers Limited. All rights reserved 


Extended Data Figure 2 | Tests of the distribution of growth-forms (a)
and major taxa (b) in trait space. Woody and non-woody species differed
significantly in their positions along PC1 but not along PC2. Angiosperms
differed significantly from gymnosperms and pteridophytes in their
positions along both axes; gymnosperms and pteridophytes differed
in their position along PC1 but not along PC2 (see Methods for details
of PCA analysis and a posteriori tests). Whiskers denote ± 3 s.d. from
mean; n woody = 1,001; n non-woody = 1,209; n angiosperms = 2,120;
n gymnosperms = 80; n pteridophytes = 14. Horizontal bars and dots
within boxes indicate mean and median, respectively. Means with the same
letter are not significantly different (Fisher's least significant difference;
P > 0.01).
Extended Data Figure 3 | Selected bivariate relationships underlying the global spectrum of plant form and function, showing herbaceous (green)
and woody (black) species separately. See Extended Data Fig. 4 for standardized major axes statistics (slope, r², sample size) of these and all other
pairwise trait combinations.
Extended Data Figure 4 | Bivariate relationships between the six
traits that underlie the global spectrum of plant form and function.
The lower left portion of the matrix shows two-dimensional probability
density distributions of bivariate trait relationships derived through
kernel density estimation (see Methods). The colour gradient indicates
regions of highest (red) to lowest (white) occurrence probability of trait
combinations with contour lines indicating 0.5, 0.95 and 0.99 quantiles.
The upper right portion contains standardized major axis (SMA)
statistics (slope, r², sample size n, and statistical significance: NS, P > 0.05;
*0.05 > P > 0.01; **0.01 > P > 0.001; ***P < 0.001) for the corresponding
relationships for all species (a), and for herbaceous (h) and woody species
(w) separately. The diagonal displays the total sample sizes for each trait.
For traits showing a strongly bimodal distribution, the all-species slope
and correlation should be treated with caution. Pteridophytes show a
discontinuous distribution in SM, but otherwise fall well within the
general distribution of points; they represent less than 1% of the dataset,
therefore including or excluding them does not significantly alter any of
the relationships. SMAs were fitted using SMATR v.2
(http://www.bio.mq.edu.au/ecology/SMATR/). Axis units legible in the panels
include SSD (mg mm⁻³), Nmass (mg g⁻¹) and LMA (g m⁻²).
Extended Data Table 1 | Principal component analyses (PCAs) of global plant trait data

                                  Main           Angiosperms    Excl. LMA      Excl. Nmass    Excl. SSD      Gap-filled
                                  analysis       only
                                  PC1    PC2     PC1    PC2     PC1    PC2     PC1    PC2     PC1    PC2     PC1    PC2
Variance explained (%)            49     25      50     24      52     24      56     23      45     30      42     24
Eigenvalue                        2.93   1.50    3.01   1.44    2.60   1.19    2.81   1.17    2.26   1.48    2.53   1.44
Significance of Procrustes test                  0.0014                                                      0.0014
Variable loadings
Leaf area (LA)                    0.23   0.58    0.30   0.51    0.33   0.56    0.28   0.73    0.34   0.53    0.34   0.44
Leaf nitrogen per mass (Nmass)    [?]    0.57    0.21   0.60    -0.17  0.76    -      -       0.27   0.62    -0.21  0.65
Leaf mass per area (LMA)          0.40   -0.46   0.37   -0.49   -      -       0.36   -0.60   0.42   -0.52   0.36   -0.56
Plant height (H)                  0.52   0.20    0.52   0.22    0.57   0.02    0.54   0.10    0.59   0.13    0.54   0.09
Diaspore mass (SM)                0.45   0.30    0.46   0.25    0.51   0.16    0.49   0.19    0.54   0.23    0.49   0.24
Stem specific density (SSD)       0.51   -0.09   0.50   -0.14   0.52   -0.30   0.51   -0.26   -      -       0.42   0.03

Eigenvalues and trait loadings of principal components (PC1 and PC2) in six different PCAs. Main analysis corresponds to the PCA performed on 2,214 species for which values of all six traits were
available, and which is reported in the main text and expressed graphically in Fig. 2. The rest of the columns correspond to PCAs carried out on angiosperms only (2,120 species), on all taxa but
excluding LMA, Nmass or SSD one at a time (2,214 in all cases), and on a gap-filled dataset of 45,507 species with missing trait records imputed using BHPMF (see Methods). The results of all PCAs
show strong similarity, indicating robustness of the pattern obtained in the main analysis. Only PC1 and PC2 were identified as significant (see Methods) and therefore are reported here. All PCAs were
performed on the correlation matrix of log10-transformed traits.
Extended Data Table 2 | Description and illustrative examples of species at different positions at the margin of the global spectrum of plant
form and function

Brief description and examples

① Tall, very large-seeded trees with large leaves of intermediate LMA and Nmass. Examples include the
Neotropical Bertholletia excelsa (Brazil nut), Gustavia superba, Pentaclethra macroloba, and Omphalea
spp.

② Relatively large-seeded shrubs and trees of various heights, with small, sclerophyllous, highly conservative
(high LMA, low Nmass) leaves. Examples include, among gymnosperms, monkey puzzle tree (Araucaria
araucana), giant sequoia (Sequoiadendron giganteum) and junipers (Juniperus spp.). Among angiosperms,
it includes members of the Proteaceae (e.g. the Australian Hakea and the South African Leucadendron
genera) and Myrtaceae families (Melaleuca uncinata and Eucalyptus dumosa).

③ Sclerophyllous, high-LMA, low-Nmass species, of small stature, leaves, and seeds, varying from shrubs (such
as Fumana thymifolia, heathers Calluna vulgaris and Erica tetralix, and chamise Adenostoma fasciculatum),
to small forbs or sub-shrubs (such as Diapensia lapponica, Draba spp. and Sedum spp.), to graminoids (e.g.
Muhlenbergia ramulosa and Aristida purpurea).

④ Submerged and semi-submerged aquatics (such as bladderwort Utricularia vulgaris, watermilfoil
Myriophyllum spicatum, Zannichellia palustris, and Ranunculus aquatilis) and ephemeral, small-seeded and
small- and acquisitive-leaved (low LMA, high Nmass) species of very short stature, with very low investment
in vegetative structures other than leaves (such as thale cress Arabidopsis thaliana, annual bluegrass Poa
annua, and Nama dichotoma).

⑤ Large-leaved, high-Nmass herbaceous plants with little carbon investment in support tissue. These are
illustrated by robust aquatic species such as larger pondweeds (Potamogeton spp.) and sacred lotus
(Nelumbo nucifera), species with nitrogen-rich secondary compounds (presumably anti-herbivore defences;
e.g. devil’s snare Datura stramonium, henbane Hyoscyamus niger), and some common crop and agricultural
weed species such as Beta vulgaris, Phaseolus vulgaris, Cannabis sativa, and Arctium minus. Also includes
Boreal ‘mega-herb’ Angelica archangelica.

Circled numbers in the first column refer to extreme poles of whole-plant specialization, whose approximate positions in the plane defined by PC1 and PC2 are indicated within circles in Fig. 2a.
An ID2-dependent mechanism for VHL
inactivation in cancer


Sang Bae Lee, Veronique Frattini, Mukesh Bansal, Angelica M. Castano, Dan Sherman, Keino Hutchinson,
Jeffrey N. Bruce, Andrea Califano, Guangchao Liu, Timothy Cardozo, Antonio Iavarone & Anna Lasorella


Mechanisms that maintain cancer stem cells are crucial to tumour progression. The ID2 protein supports cancer hallmarks
including the cancer stem cell state. HIFα transcription factors, most notably HIF2α (also known as EPAS1), are expressed
in and required for maintenance of cancer stem cells (CSCs). However, the pathways that are engaged by ID2 or drive
HIF2α accumulation in CSCs have remained unclear. Here we report that DYRK1A and DYRK1B kinases phosphorylate
ID2 on threonine 27 (Thr27). Hypoxia downregulates this phosphorylation via inactivation of DYRK1A and DYRK1B. The
activity of these kinases is stimulated in normoxia by the oxygen-sensing prolyl hydroxylase PHD1 (also known as EGLN2).
ID2 binds to the VHL ubiquitin ligase complex, displaces VHL-associated Cullin 2, and impairs HIF2α ubiquitylation and
degradation. Phosphorylation of Thr27 of ID2 by DYRK1 blocks ID2-VHL interaction and preserves HIF2α ubiquitylation.
In glioblastoma, ID2 positively modulates HIF2α activity. Conversely, elevated expression of DYRK1 phosphorylates Thr27
of ID2, leading to HIF2α destabilization, loss of glioma stemness, inhibition of tumour growth, and a more favourable
outcome for patients with glioblastoma.


The HIFα (hypoxia inducible factor alpha) transcription factors are
the key mediators of the hypoxia response (ref. 1). HIFα protein dysregulation
in cancer can be triggered by mutation of the von Hippel-Lindau
(VHL) gene, an event that hinders the negative control of HIFα protein
stability through the ubiquitin ligase activity of VHL. This idea has
been validated for HIF2α, the HIF isoform preferentially upregulated
in VHL-mutant tumours and which has recently been implicated as
a driver of cancer stem cells (refs 2-5). However, signalling events that link the
stem cell-intrinsic transcriptional machinery to pivotal mechanisms of
HIF2α regulation in cancer remain to be charted.

ID proteins (ID1 to ID4) are master regulators of stem cells that are
hijacked during tumorigenesis and foster stem cell self-renewal and
angiogenesis (refs 6, 7). Although the pro-tumorigenic role of ID proteins has
been linked to the accumulation of mRNAs and proteins, it remains
unclear whether other mechanisms exist that deregulate ID activity in
cancer cells. Among ID proteins, ID2 is essential for tumour angiogenesis
and glioma stemness and it is a component of the signature that
predicts poor outcome in patients with high-grade glioma (refs 8-10).

We show that ID2 activity is restrained by DYRK1 kinase-mediated
phosphorylation on Thr27 and hypoxia reduces this event by inhibiting
DYRK1 activity. In hypoxic brain tumour cells, active ID2 binds to and
disrupts the VHL complex, thus preventing ubiquitin-mediated proteasomal
degradation of HIF2α. This is a previously unrecognized mechanism
that fosters cancer stem cells and aggressiveness of human cancer.


Hypoxia regulates phosphorylation of ID2 by DYRK1 

We used mass spectrometry to identify the phosphorylation sites of
human ID2. Besides Ser5 (ref. 11), we found that ID2 is phosphorylated
on Ser14 and Thr27 (Extended Data Fig. 1a-c). A sequencing
analysis of the ID2 gene in cancer revealed that the colorectal cancer
cell line HRT-18 harbours and expresses a mutant ID2(T27A) protein
(Extended Data Fig. 2a, b). Thr27 of ID2 is highly conserved
throughout evolution (Extended Data Fig. 2c). The primary role
of ID proteins is to preserve stem cell properties, a function widely
documented in neural stem cells (NSCs) (refs 12, 13). Therefore, to interrogate
the significance of the ID2(T27A) mutation, we tested the self-renewing
capacity of ID2-null NSCs reconstituted with wild-type ID2
(ID2(WT)) or ID2(T27A) (Fig. 1a). Introduction of ID2(T27A) in
ID2-null NSCs increased neurosphere formation in serial passages
by more than 50% when compared with ID2(WT) (P = 0.00883-0.000229;
t ratio = 4.772-12.597) and caused a 2.4-fold increase
in cell expansion rate (40.5 ± 1.7 versus 16.7 ± 0.831; P < 0.0001,
Fig. 1b-d). From the analysis of 18 candidate kinases, the dual-specificity
tyrosine-phosphorylation-regulated kinases 1A and 1B
(DYRK1A and DYRK1B) emerged as the only enzymes able to phosphorylate
Thr27 of ID2 (Fig. 1e, Extended Data Fig. 2d). The sequence
surrounding the Thr27 residue in ID2 resembles the DYRK1 phosphorylation
consensus motif RX(X)(S/T)P and is highly conserved
in different species (Extended Data Fig. 2c) (ref. 14). Antibodies against a
phospho-T27-ID2 peptide confirmed that ID2 is phosphorylated
by wild-type kinase but not the inactive DYRK1B(K140R) kinase
(Fig. 1a, f-h) (ref. 15). Endogenous and exogenous ID2 and ID2(T27A)
co-precipitated endogenous DYRK1A and DYRK1B (Fig. 1i and
Extended Data Fig. 2e). Treatment of glioma cells with harmine, a
small-molecule inhibitor of DYRK1 (ref. 16), or combined short hairpin
RNA (shRNA)-mediated silencing of DYRK1A and DYRK1B reduced
Thr27 phosphorylation of ID2 (Fig. 3e and Extended Data Fig. 2f).
Next, we sought to identify the regulatory mechanisms controlling
Thr27 phosphorylation of ID2. Exposure of human GBM-derived glioma
stem cells (GSCs) to hypoxia or the hypoxia-mimicking agent cobalt
chloride (CoCl2) caused loss of Thr27 phosphorylation (Fig. 2a and
Extended Data Fig. 3a). Determination of the Thr27 phosphorylation
stoichiometry of ID2 in the neuronal cell line SK-N-SH revealed
that 21.08% of ID2 was phosphorylated on Thr27 in normoxia but
the phosphorylation dropped to 2.28% in a hypoxic environment
(Extended Data Fig. 3b, c). Mirroring the reduction of Thr27 phosphorylation
of ID2, CoCl2 reduced DYRK1 kinase activity (Fig. 2b and
Extended Data Fig. 3d) and DYRK1 auto-phosphorylation, an event
Figure 1 | Inhibition of DYRK1-mediated phosphorylation of ID2 at
Thr27 promotes NSC properties. a, Reconstitution of ID2−/− NSCs with
ID2(WT), ID2(T27A) or the empty vector. b, Microphotographs of
representative cultures from the neurosphere-forming assay. c, Percent
of neurospheres generated in serial clonal assays (means of 3 biological
replicates ± s.d.; ***P = 0.00883-0.000229 for ID2−/− ID2(T27A)
compared with ID2−/− ID2(WT)). d, Cumulative cell number of cultures
as in c (means of 3 biological replicates ± s.d.; ***P < 0.0001 for ID2−/−
ID2(T27A) compared with ID2−/− ID2(WT)). e, In vitro kinase assay
shows phosphorylation of GST-ID2 protein by recombinant DYRK1B.
f, Phosphorylation of ID2 but not ID2(T27A) by DYRK1A in IMR32
cells. g, Phosphorylation of endogenous ID2 by DYRK1A in U87 cells.
h, Phosphorylation of endogenous ID2 by DYRK1B but not the kinase
inactive DYRK1B(K140R) in U87 cells. i, Binding between endogenous
DYRK1A or DYRK1B and ID2. WCL, whole cellular lysate.
Figure 2 | DYRK1 kinases and ID2 Thr27 phosphorylation are inhibited 
by hypoxia and enhanced by PHD1. a, Hypoxia inhibits phosphorylation 
of ID2 Thr27 in GSC line 1123. b, CoCl2 inhibits DYRK1 kinase. 
GFP-DYRK1 immunoprecipitates from U87 cells untreated or treated 
with CoCl2 and recombinant Flag-ID2 were used in kinase assay in vitro. 
c, d, Hypoxia reduces tyrosine phosphorylation of DYRK1A (c) and 
DYRK1B (d) as evaluated by anti-p-Tyr immunoprecipitation in GSC 
line 1123 cells. e, CoCl2 inhibits proline hydroxylation of DYRK1A and 
DYRK1B as shown by anti-hydroxyproline immunoprecipitation in U87 
glioma cells (HC, IgG heavy chain; LC, IgG light chain). f, Endogenous 
DYRK1A and DYRK1B interact with Flag-PHD1 in U87 cells. g, The 
kinase domain (KD) but not the N- or the C-terminal domains of 
DYRK1B interacts with PHD1 in co-immunoprecipitation assay. 
h, Expression of PHD1 enhances cellular DYRK1 kinase activity in an 
in vitro phosphorylation assay using recombinant ID2. i, Expression of 
PHD1 enhances DYRK1 kinase activity towards ID2 Thr27 in vivo. 
Figure 3 | The DYRK1-ID2-Thr27 pathway controls GSCs and 
HIF2α. a, Phosphorylation of ID2 but not ID2(T27A) by GFP-DYRK1B 
downregulates HIF2α and SOX2 in GSC line 31. b, In vitro LDA of 
parallel cultures shows that decreased frequency of gliomaspheres 
by DYRK1B is rescued by ID2(T27A). Data are means of 3 biological 
replicates ± s.d.; **P = 0.0031 (vector versus DYRK1B); ***P = 0.00022 
(DYRK1B versus DYRK1B plus ID2(T27A)). c, Microphotographs of 
representative cultures in b. d, Serial clonal experiments of cells in 
b. Data are means of 3 biological replicates ± s.d. of percent gliomaspheres; 
*P = 0.00059-0.00007 for vector versus DYRK1B plus vector; 
*P = 0.0089-0.0008 for ID2(T27A) plus DYRK1B versus DYRK1B plus 


required for the activity of DYRK1 kinase17 (Extended Data Fig. 3e-g). 
Similarly, exposure of GSCs to low oxygen decreased DYRK1A and 
DYRK1B tyrosine auto-phosphorylation (Fig. 2c, d). 

Prolyl hydroxylases PHD1, PHD2 and PHD3 operate as direct 
sensors of cellular oxygen concentration18,19. Immunoprecipitation 
using an antibody that recognizes hydroxyprolines indicated that 
DYRK1A and DYRK1B carry hydroxylated prolines, and CoCl2 abrogated 
DYRK1 prolyl hydroxylation (Fig. 2e). DYRK1A and DYRK1B 
interacted in vivo with PHD1 (Fig. 2f) and the expression of PHD1 
enhanced prolyl hydroxylation of both kinases (Extended Data 
Fig. 3h). In particular, DYRK1B interacted with PHD1 through the 
kinase domain (Fig. 2g). The activity of DYRK1A and DYRK1B 
towards Thr27 of ID2 was potentiated by PHD1 in vitro and in vivo 
(Fig. 2h, i). Thus, oxygen deprivation induces a constitutively active 
ID2 by inactivating DYRK1 kinases, which are positively regulated 
substrates of PHD1. 


Phosphorylation of ID2 by DYRK1 destabilizes HIF2α 

We used human GSCs to interrogate the effects of DYRK1 and 
ID2(T27A) on HIF2α and glioma stemness. Lentiviral transduction 
of the DYRK1-resistant ID2(T27A) mutant in GSC line number 48 
resulted in elevation of HIF2α and enhanced tumour sphere-forming 
capacity as measured by limiting dilution assay (LDA; Extended 
Data Fig. 4a-d). ID2(T27A)-induced accumulation of the HIF2α 
protein was independent of transcription (Extended Data Fig. 4e). 
When detectable, HIF1α levels mirrored those of HIF2α but with more 
limited changes (Extended Data Fig. 4f). 
vector; NS: P = 0.061-0.249 for ID2(T27A) plus DYRK1B versus vector. 
e, Silencing of DYRK1 downregulates phospho-Thr27 of ID2 and 
increases HIF2α in U87 cells. f, Ubiquitylation of HIF2α is enhanced by 
DYRK1B and reduced by ID2(T27A) as evaluated by in vivo ubiquitylation 
(left panels, MYC-Ub immunoprecipitation/HA-HIF2α western blot; 
right panels, whole cellular lysates, WCL). g, ID2(T27A) elevates HIF2α 
and opposes DYRK1B-mediated reduction of HIF2α during hypoxia. 
h, ID2(T27A) reverts DYRK1B-mediated decrease of HIF2α half-life 
during recovery from exposure to CoCl2. U87 cells were exposed to CoCl2 for 
3 h followed by CoCl2 washout. i, Quantification of HIF2α protein from 
the experiment in h. 


Expression of DYRK1 in GSC line number 34 and GSC line number 
31 reduced HIF2α, the HIF2α target TGFα and the glioma stem cell 
marker SOX2 (Fig. 3a and Extended Data Fig. 4g, h). Also in this case 
HIF2α mRNA was unchanged (Extended Data Fig. 4h, j). LDA and 
serial clonal experiments showed that the DYRK1-induced decrease 
of HIF2α attenuated glioma stemness (Fig. 3b-d and Extended Data 
Fig. 4k). However, accumulation of HIF2α, expression of SOX2 and 
the frequency of GSCs were restored by co-expression of DYRK1 and 
ID2(T27A) but not ID2(WT) (Fig. 3a-d and Extended Data Fig. 4k). 
DYRK1-mediated inhibition of gliomasphere formation was overridden 
by co-expression of non-degradable HIF2α (HIF2α-TM; TM, triple 
mutant; Extended Data Fig. 4l)20. Furthermore, silencing of DYRK1A 
or DYRK1B upregulated HIF2α and reduced phosphorylation of 
Thr27 of ID2, with maximal effects after co-silencing of both kinases 
(Fig. 3e). Next, we investigated the effects of DYRK1 and ID2(T27A) 
on ubiquitylation and the stability of HIFα. DYRK1-mediated 
phosphorylation of Thr27 triggered HIF ubiquitylation, and expression of 
ID2(T27A) reversed the DYRK1 effect (Fig. 3f and Extended Data Fig. 5a). 
Similarly, expression of DYRK1B prevented accumulation of HIF2α 
under hypoxia and co-expression of ID2(T27A) abrogated this response 
(Fig. 3g). DYRK1 accelerated the decay of HIF2α during recovery from 
exposure to CoCl2 and reduced HIF2α half-life, and ID2(T27A) 
counteracted these effects (Fig. 3h, i and Extended Data Fig. 5b, c). 


ID2 binds and disrupts the VCB-Cul2 complex 
Mass spectrometry analysis of ID2 immunoaffinity complexes revealed 
that Elongin C, a component of the VCB-Cul2 ubiquitin ligase 
Figure 4 | Unphosphorylated ID2 binds to and disrupts the VCB-Cul2 
complex. a, Recombinant ID2 interacts with Elongin C and VHL in a GST 
pulldown assay (input: 10%). b, Co-immunoprecipitation shows that VHL 
preferentially interacts with ID2. c, The N terminus of ID2 is required 
for the interaction with VHL as determined by immunoprecipitation 
and western blot (IP-WB) analysis. d, The N terminus of GST-ID2 is 
required for the interaction with in vitro translated HA-VHL. e, Amino 
acids 154-174 of in vitro translated HA-VHL mediate the interaction 
with GST-ID2 (input: 10%). f, Phosphorylation of ID2 Thr27 or the 
ID2(T27W) mutation disrupts the ID2-VHL interaction as analysed 
by in vitro streptavidin pull down of biotinylated ID2 peptides in the 


complex that includes VHL, Elongin C, Elongin B, Cullin 2, and RBX1 
is an ID2-associated protein (Supplementary Tables 1 and 2). The direct 
interaction of ID2 with Elongin C and VHL was confirmed in vitro and 
in vivo (Fig. 4a and Extended Data Fig. 5d, e). ID2 was unable to bind to 
HIFα proteins (Extended Data Fig. 5d). VHL and Elongin C interacted 
strongly with ID2, weakly with ID1 and ID3, and did not bind to ID4 
(Fig. 4b and Extended Data Fig. 5f, g). The interaction between ID2 and 
VHL was mediated by the amino-terminal region of ID2 that includes 
Thr27 and did not require the HLH domain (amino acids 35-76, 
Fig. 4c). A more detailed mapping of the regions involved in the VHL- 
ID2 interaction revealed that amino acids 15-35 of ID2 and the SOCS 
box of VHL (amino acids 154-174) were required for the binding 
(Fig. 4d, e). Mutation of Lys159 (K159E), which provides the VHL 
contact surface for binding to Cul2 (refs 21, 22), impaired the 
interaction with ID2 (Fig. 4e). Consistent with ID2 deletion mapping 
analysis, an ID2 peptide composed of amino acids 14-34 bound to 
both VHL and components of the VCB complex pre-assembled in 
insect cells. However, addition of a phosphate to Thr27 or mutation 
of Thr27 to a bulky hydrophobic amino acid (T27W) prevented the 
binding to both VHL and the VCB complex (Fig. 4f and Extended Data 
Fig. 5h). These findings were corroborated by computational molecular 
docking whereby an N-terminally derived ID2 peptide (amino acids 
15-31) docked preferentially to a groove on the molecular surface of 


presence of recombinant VCB-Cul2. g, Co-immunoprecipitation shows 
that DYRK1B-mediated phosphorylation of ID2 disrupts ID2 interaction 
with VHL in vivo. h, In vitro phosphorylation of recombinant ID2 by 
purified DYRK1B blocks ID2 interaction with VHL and Elongin C in the 
reconstituted VCB-Cul2 complex. i, ID2(T27A) displaces Cul2 from VCB 
complex in co-immunoprecipitation assays in U87 cells. j, Progressive 
dissociation of Cul2 from recombinant VCB complex by increasing 
concentration of purified Flag-ID2. k, Silencing of ID2 in U87 reverts 
CoCl2-mediated dissociation of Cul2 from VHL as evaluated by Co-IP-WB. 
WCL, whole cellular lysates. 


VHL:Elongin C” with the N-terminal half of its interaction surface 
contacting the SOCS box of VHL that binds Cul2 (primarily Lys159) 
and the C-terminal half (including Thr27) fitting snugly into a hydro- 
phobic pocket mostly contributed by the Elongin C surface (Extended 
Data Fig. 6a, c, d). Mutating Thr27 to phospho-Thr27 and re-dock- 
ing resulted in unfavourable energy and displacement of the peptide 
from this location on the complex (Extended Data Fig. 6b). DYRK1 
disrupted the interaction between VHL and ID2(WT) but did not 
affect the binding of VHL to ID2(T27A) (Fig. 4g). The phosphomimic 
ID2(T27E) mutant failed to bind VHL and did not promote accumu- 
lation of HIF2a (Extended Data Fig. 7a, b). Next, we developed an 
in vitro assay using purified proteins that included bacterially expressed 
Flag-ID2, enzymatically active recombinant DYRK1B and baculovi- 
rus-expressed VCB-Cul2 complex or purified VHL. In this system, 
ID2 bound to VCB-Cul2 complex and VHL in the absence of active 
DYRK1B, but the interaction was disrupted by DYRK1-mediated 
phosphorylation of Thr27 (Fig. 4h and Extended Data Fig. 7c). 

In the VCB-Cul2 complex, the Cul2 subunit provides the scaffold 
module for the interaction with the ubiquitin-conjugating enzyme 
(E2)??4, As we expressed ID2(T27A), ID2 was loaded onto VCB, and 
the Cul2/RBX1 module dissociated from the complex in the absence of 
changes in the total cellular levels of Cul2 and RBX1 (Fig. 4i). We also 
found that challenging the pre-assembled VCB complex with increasing 
Figure 5 | DYRK1 kinases inhibit human glioma growth by repressing 
an ID2-HIF2α network. a, Significant and positive targets of HIF2α 
correlate with HIF2α in GBM with high ID2 activity compared to 
a set of random genes by GSEA. b, Correlations are not significant 
in GBM with low ID2 activity. c, Inducible expression of DYRK1B 
in U87 causes ID2 Thr27 phosphorylation and downregulation of 
HIF2α. d, qRT-PCR from cells treated as in c; n = 9 (3 biological 
replicates performed in triplicates) ± s.d. BMI1: *P = 0.0470; PI3KCA: 
*P = 0.0279; FLT1: *P = 0.0246; POU5F-1: ***P = 0.000796344; NANOG: 
**P = 0.000737396; SOX2: *P = 0.028884239 (DYRK1B minus 
Dox versus DYRK1B plus Dox). e, Inducible expression of DYRK1B 


amounts of recombinant Flag-ID2 resulted in progressive dissocia- 
tion of Cul2 (Fig. 4j). Expression of ID2(T27A) triggered a compara- 
ble block of Elongin C-Cul2 association whereas it did not affect the 
assembly of a Cul5-based complex containing SOCS2, a SOCS protein 
that cannot bind to ID2(T27A) (Extended Data Fig. 7d, e). Finally, in 
glioma cells in which ID2 and VHL are present at a molar ratio of 5.7:1 
(Extended Data Fig. 7f), hypoxia signalling promoted the association 
between endogenous VHL and ID2 while dissociating Cul2 (Fig. 4k 
and Extended Data Fig. 7g). Silencing of ID2 rescued the dissociation 
of Cul2 from VCB complex and prevented HIF2α elevation (Fig. 4k). 
Together, these findings indicate that ID2 activation stabilizes HIF2α 
by disabling VCB ubiquitin ligase via dissociation of Cul2. 


A DYRK1-ID2 pathway controls HIF2α in glioma 

To determine whether activation of ID2 enhances HIF2α transcriptional 
activity in an unbiased fashion, we used CINDy, an algorithm 
for high-fidelity reconstruction of post-translational causal 
dependencies, to interrogate whether ID2 can affect the activity of HIF2α on its 
targets in the context of GBM25. When applied to a collection of 548 
TCGA-derived GBM samples, ID2 activity emerged as the modulator 
of the transcriptional connection between HIF2α and its activated 
target genes (Fig. 5a, b). The activity of ID2 is estimated by the VIPER 
algorithm, a computational tool designed to infer protein activity from 
gene expression data26. When GBM samples were divided into two 
groups based on ID2 activity, samples with higher ID2 activity showed 
significantly stronger correlation between HIF2α and its targets than a 
set of random genes (P = 0.001, Fig. 5a). This positive correlation was 
absent in the cohort of GBM with low ID2 activity (P = 0.093, Fig. 5b). 
Consistent with these observations, we detected a marked reduction 
of HIF2α protein following acute deletion of the Id1 and Id2 genes in a 
mouse model of malignant glioma (Extended Data Fig. 8a)°. We inter- 
rogated the effect of DYRK1 expression in mouse models of human 
glioma. Tetracycline-induced expression of DYRK1B at levels compa- 
rable to normal brain (Extended Data Fig. 8b), downregulated HIF2a 
downregulates HIF2α in subcutaneous xenografts of U87 cells as 
indicated by immunostaining. f, Inducible expression of DYRK1B causes 
tumour growth inhibition in mice treated as in e. Doxycycline (Dox) 
treatment starts at day 0 (n = 7 mice per group; **P = 0.0040 and 0.0069, 
DYRK1B - Dox versus DYRK1B + Dox at day 4 and day 5, respectively). 
g, Expression of DYRK1B(WT) but not DYRK1B(K140R) inhibits orthotopic 
growth of U87 (haematoxylin & eosin staining of brain cross-sections). 
Mice injected with U87-vector or DYRK1B(K140R) were euthanized on 
day 25. Mice injected with U87-DYRK1B were euthanized on day 70. 
h, Kaplan-Meier analysis of mice in g (n = 7 animals per group). 
i, Expression of DYRK1A and DYRK1B predicts survival in GBM patients. 


in glioma cells in vitro and in subcutaneous xenografts and reduced 
the expression of the HIF2α targets that promote stem cell functions 
(Fig. 5c-e). Expression of DYRK1B also inhibited tumour cell 
proliferation in vivo, resulting in tumour reduction (Fig. 5f and Extended Data 
Fig. 8c, d). Next, we evaluated the anti-tumour effects of DYRK1B(WT) 
or the kinase inactive K140R mutant in an orthotopic model of glioma 
(Extended Data Fig. 8e). Animals bearing glioma cells that expressed 
DYRK1B(WT) manifested significantly increased survival and tumour 
latency relative to mice bearing DYRK1B(K140R) or vector trans- 
duced cells (Fig. 5g, h). Two out of seven mice in the DYRK1B(WT) 
group developed tumours that failed to express exogenous DYRK1B 
(Extended Data Fig. 8f). This result suggests that active DYRK1 kinase 
is incompatible with tumour growth in this glioma model. Finally, 
higher DYRK1A and DYRK1B predicted a more favourable clinical 
outcome for GBM patients, thus supporting the clinical significance of 
DYRK1 activity in glioma (Fig. 5i and Extended Data Fig. 9a). 


Discussion 
Here we report a novel mechanism of functional inactivation of the 
VHL ubiquitin ligase that is independent of genetic mutations of the 
VHL gene. This mechanism is centred on the ability of active ID2 to 
disrupt the VCB-Cul2 complex, leading to HIF2α stabilization. We 
also unravel a hypoxia-directed cascade of events that, by overriding 
the restraining effect of DYRK1-mediated phosphorylation of ID2 on 
Thr27, culminates with ID2 activation and HIF2α stabilization. The 
transcriptional activation of the ID2 gene, a HIFα target27, by HIF2α 
generates a feed-forward ID2-HIF2α loop that further supports cancer 
stem cells and glioma aggressiveness (Extended Data Fig. 10). By showing 
that PHD1-mediated prolyl hydroxylation enhances the enzymatic 
activity of DYRK1 kinases towards ID2, our findings provide a clue to 
the mechanism of DYRK1 inhibition in hypoxia. Thus, inhibition of 
DYRK1 kinases is an oxygen-sensing signal that disables VCB-Cul2. 
The gene coding for DYRK1A is gained in Down syndrome, 
a disease characterized by impaired neural proliferation during 
development”*, reduced self-renewal and premature withdrawal 
from the cell cycle*'. These findings are consistent with the notion that 
activation of DYRK1A and DYRK1B inhibits proliferation and activates 
a cellular quiescence program**’. It is plausible that DYRK1-activating 
signals are negative regulators of self-renewal of both normal and can- 
cer stem cells. By phosphorylating ID2 on Thr27, DYRK1 weakens the 
core of the stemness machinery centred on the ID2-HIF2« pathway. 
We suggest that together with the control of ID2 activity, regulation 
of sonic hedgehog/Gli signalling and CREB-mediated transcription 
by DYRK kinases cooperate to restrain stem cell functions and tum- 
origenesis in the nervous system**“°. The glioma inhibitory effect of 
DYRK1 in the mouse, and the predictive value of DYRK1 expression 
on GBM patient survival support the tumour suppressor function of 
DYRK1 kinases in vivo. Whereas there is agreement on the negative 
role of DYRK1 towards stemness and cell proliferation?!», the broad 
spectrum of DYRK kinase substrates may account for the pro-tumori- 
genic functions of DYRK1 reported by other studies*!. It has been pro- 
posed that loss of DYRK1 and accumulation of ID2 and HIF2a drive 
tumour progression of other cancer types beside GBM. The mechanism 
reported here might operate widely during the transition of cancer cells 
towards the more aggressive stem-like state that drives maintenance 
and progression of solid tumours. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Data reporting. No statistical methods were used to predetermine sample size. 
The investigators were not blinded to allocation during experiments and outcome 
assessment. 
Plasmids, cloning and lentivirus production. A constitutively stabilized 
mutant of HIF2α (HIF2α-TM) was obtained from Christina Warnecke20. 
The HIF2α-TM (triple mutant) construct harbours the following mutations 
in the prolyl and asparagyl hydroxylation sites: P405A, P530G and N851A. 
Polypeptide fragments of DYRK1B were cloned into pcDNA3-HA and include 
DYRK1B N terminus, N-Ter (amino acids 1-110), DYRK1B kinase domain, 
KD (amino acids 111-431), and DYRK1B C terminus, C-Ter (amino acids 
432-629). cDNAs for RBX1, Elongin B and Elongin C were kindly provided 
by Michele Pagano (New York University) and cloned into the pcDNA 
vector by PCR. HA-tagged HIF1α and HIF2α were obtained from Addgene. 
GFP-tagged DYRK1A and DYRK1B were cloned into pcDNA vector. 
pcDNA-HA-VHL was provided by Kook Hwan Kim (Sungkyunkwan University 
School of Medicine, Korea). Site-directed mutagenesis was performed using 
QuikChange or QuikChange Multi Site-Directed mutagenesis kit (Agilent) 
and resulting plasmids were sequence verified. Lentivirus was generated by 
co-transfection of the lentiviral vectors with pCMV-ΔR8.1 and pMD2.G 
plasmids into HEK293T cells as previously described”. shRNA sequences are: ID2-1: 
GCCTACTGAATGCTGTGTATACTCGAGTATACACAGCATTCAGTAGGC; 
ID2-2: CCCACTATTGTCAGCCTGCATCTCGAGATGCAGGCTGACA 
ATAGTGGG; DYRK1A: CAGGTTGTAAAGGCATATGATCTCGAGATC 
ATATGCCTTTACAACCTG; DYRK1B: GACCTACAAGCACATCAATGA 
CTCGAGTCATTGATGTGCTTGTAGGTC. 
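The shRNA oligonucleotides listed above appear to follow a sense-loop-antisense layout, in which a 21-nucleotide stem is joined to its reverse complement by a CTCGAG (XhoI site) loop. As a minimal sketch, assuming that layout, the following Python snippet splits a hairpin oligo and checks that the two arms are reverse complements of each other. The function names are illustrative and are not part of the original methods.

```python
# Translation table for DNA base pairing (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def split_hairpin(oligo: str, loop: str = "CTCGAG") -> tuple[str, str]:
    """Split a sense-loop-antisense hairpin into its two arms.

    Assumes the loop sits exactly in the middle of the oligo and that the
    antisense arm is the reverse complement of the sense arm; raises
    ValueError if either assumption does not hold.
    """
    stem_len = (len(oligo) - len(loop)) // 2
    sense = oligo[:stem_len]
    middle = oligo[stem_len:stem_len + len(loop)]
    antisense = oligo[stem_len + len(loop):]
    if middle != loop:
        raise ValueError("loop sequence not found at the expected position")
    if antisense != reverse_complement(sense):
        raise ValueError("arms are not reverse complements")
    return sense, antisense


# ID2-1 hairpin as listed in the Methods.
id2_1 = "GCCTACTGAATGCTGTGTATACTCGAGTATACACAGCATTCAGTAGGC"
sense, antisense = split_hairpin(id2_1)
```

A check of this kind can catch transcription errors in a hairpin sequence before oligos are ordered.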
Cell culture and hypoxia induction. IMR-32 (ATCC CCL-127), SK-N-SH (ATCC 
HTB-11), U87 (ATCC HTB-14), NCI-H1299 (ATCC CRL-5803), HRT18 (ATCC 
CCL-244), and HEK293T (ATCC CRL-11268) cell lines were acquired through 
American Type Culture Collection. U251 (Sigma, catalogue number 09063001) 
cell line was obtained through Sigma. Cell lines were cultured in DMEM supple- 
mented with 10% fetal bovine serum (FBS, Sigma). Cells were routinely tested 
for mycoplasma contamination using Mycoplasma Plus PCR Primer Set (Agilent, 
Santa Clara, CA) and were found to be negative. Cells were transfected with 
Lipofectamine 2000 (Invitrogen) or calcium phosphate. Mouse NSCs were grown 
in Neurocult medium (StemCell Technologies) containing 1× proliferation 
supplements (StemCell Technologies), and recombinant FGF-2 and EGF (20 ng ml−1 
each; Peprotech). GBM-derived glioma stem cells were obtained from de-identified 
brain tumour specimens from excess material collected for clinical purposes at 
New York Presbyterian-Columbia University Medical Center. Donors (patients 
diagnosed with glioblastoma) were anonymous. Progressive numbers were used 
to label specimens coded in order to preserve the confidentiality of the subjects. 
Work with these materials was designated as IRB exempt under paragraph 4 and it 
is covered under IRB protocol #IRB-AAAI7305. GBM-derived GSCs were grown 
in DMEM:F12 containing 1x N2 and B27 supplements (Invitrogen) and human 
recombinant FGF-2 and EGF (20 ng ml−1 each; Peprotech). Cells at passage (P) 4 
were transduced using lentiviral particles in medium containing 4 µg ml−1 of 
polybrene (Sigma). Cells were cultured in a hypoxic chamber with 1% O2 (O2 Control 
Glove Box, Coy Laboratory Products, MI) for the indicated times or treated with 
a final concentration of 100-300 µM CoCl2 (Sigma) as specified in figure legends. 
Mouse neurosphere assay was performed by plating 2,000 cells in 35 mm dishes 
in collagen containing NSC medium to ensure that distinct colonies were derived 
from single cells and therefore clonal in origin**. We determined neurosphere 
formation over serial clonal passages in limiting dilution semi-solid cultures and 
the cell expansion rate over passages, which is considered a direct indication of 
self-renewing symmetric cell divisions“. For serial sub-culturing we mechanically 
dissociated neurospheres into single cells in bulk and re-cultured them under the 
same conditions for six passages. The number of spheres was scored after 14 days. 
Only colonies >100 µm in diameter were counted as spheres. Neurosphere size was 
determined by measuring the diameters of individual neurospheres under light 
microscopy. Data are presented as percent of neurospheres obtained at each passage 
(number of neurospheres scored/number of NSCs plated x 100) in three independ- 
ent experiments. P value was calculated using a multiple t-test with Holm-Sidak 
correction for multiple comparisons. To determine the expansion rate, we plated 
10,000 cells from 3 independent P1 clonal assays in 35 mm dishes and scored the 
number of viable cells after 7 days by Trypan Blue exclusion. Expansion rate of 
NSCs was determined using a linear regression model and difference in the slopes 
(P value) was determined by the analysis of covariance (ANCOVA) using Prism 
6.0 (GraphPad). Limiting dilution assay (LDA) for human GSCs was performed as 
described previously. Briefly, spheres were dissociated into single cells and plated 
into 96-well plates in 0.2 ml of medium containing growth factors at increasing 
densities (1-100 cells per well) in triplicate. Cultures were left undisturbed for 


14 days, and then the percent of wells not containing spheres for each cell dilution 
was calculated and plotted against the number of cells per well. Linear regres- 
sion lines were plotted, and we estimated the minimal frequency of glioma cells 
endowed with stem cell capacity (the number of cells required to generate at least 
one sphere in every well = the stem cell frequency) based on the Poisson distribu- 
tion and the intersection at the 37% level using Prism 6.0 software. Data represent 
the means of three independent experiments performed on different days for the 
evaluation of the effects of ID2 and ID2(T27A) in the presence or in the absence of 
DYRK1B. LDA for the undegradable HIF2α rescue experiment was performed by 
using three cultures transduced independently on the same day. 
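The 37% intersection used in the limiting dilution analysis follows from the single-hit Poisson model: if stem cells occur at frequency f, the fraction of sphere-negative wells at n cells per well is F0 = exp(−f·n), so F0 = e^−1 ≈ 0.37 when n = 1/f. The Python sketch below illustrates the calculation under the assumption of a log-linear regression through the origin. The analysis in this study was carried out in Prism 6.0 and its exact fitting options are not stated, so the function name, the fitting choice and the example numbers are all hypothetical.

```python
import math


def stem_cell_frequency(cells_per_well, negative_fraction):
    """Estimate stem cell frequency f from limiting dilution data.

    Fits ln(F0) = -f * n by least squares through the origin, where F0 is
    the fraction of wells without spheres at n cells per well. Dilutions
    with F0 == 0 are skipped because their logarithm is undefined.
    """
    points = [
        (n, math.log(f0))
        for n, f0 in zip(cells_per_well, negative_fraction)
        if 0.0 < f0 <= 1.0
    ]
    if not points:
        raise ValueError("no usable dilutions")
    sxy = sum(n * y for n, y in points)
    sxx = sum(n * n for n, _ in points)
    return -sxy / sxx


# Synthetic example: data generated from a true frequency of 1 in 20 cells.
doses = [1, 5, 10, 20, 50]
observed = [math.exp(-0.05 * n) for n in doses]

freq = stem_cell_frequency(doses, observed)
# Cell dose at which about 37% of wells are expected to be negative.
cells_at_37_percent = 1.0 / freq
```

With real data the negative-well fractions are noisy, and a dedicated limiting dilution tool that reports confidence intervals would normally be preferred over this bare regression.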

Identification of phosphorylation sites of ID2. To identify the sites of ID2 phos- 
phorylation from IMR32 human neuroblastoma cells, the immunoprecipitated ID2 
protein was excised, digested with trypsin, chymotrypsin and Lys-C and the 
peptides extracted from the polyacrylamide in two 30 µl aliquots of 50% acetonitrile/ 
5% formic acid. These extracts were combined and evaporated to 25 µl for MS 
analysis. The LC-MS system consisted of a state-of-the-art Finnigan LTQ-FT 
mass spectrometer system with a Protana nanospray ion source interfaced to 
a self-packed 8 cm × 75 µm id Phenomenex Jupiter 10 µm C18 reversed-phase 
capillary column. 0.5-5 µl volumes of the extract were injected and the peptides 
eluted from the column by an acetonitrile/0.1 M acetic acid gradient at a flow 
rate of 0.25 µl min−1. The nanospray ion source was operated at 2.8 kV. The digest 
was analysed using the double play capability of the instrument acquiring full 
scan mass spectra to determine peptide molecular weights and product ion spec- 
tra to determine amino acid sequence in sequential scans. This mode of analysis 
produces approximately 1200 CAD spectra of ions ranging in abundance over 
several orders of magnitude. Tandem MS/MS experiments were performed on 
each candidate phosphopeptide to verify its sequence and locate the phosphoryl- 
ation site. A signature of a phosphopeptide is the detection of loss of 98 daltons 
(the mass of phosphoric acid) in the MS/MS spectrum. With this method, three 
phosphopeptides were found to carry phosphorylations at residues Ser5, Ser14 
and Thr27 of the ID2 protein. 
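The 98 Da neutral-loss signature described above can be expressed as a simple m/z calculation: for a precursor ion of charge z, loss of phosphoric acid (H3PO4, monoisotopic mass 97.9769 Da) shifts the peak by 97.9769/z. The Python sketch below is an illustration only. The function names, the 0.5 m/z tolerance and the example peak list are assumptions and do not reproduce the instrument software used in this study.

```python
# Monoisotopic mass of phosphoric acid (H3PO4) in daltons.
H3PO4_MONOISOTOPIC = 97.9769


def neutral_loss_mz(precursor_mz: float, charge: int) -> float:
    """Expected m/z of the precursor after neutral loss of H3PO4."""
    return precursor_mz - H3PO4_MONOISOTOPIC / charge


def has_phospho_loss(precursor_mz, charge, fragment_mzs, tol=0.5):
    """Return True if any fragment peak matches the neutral-loss m/z.

    `tol` is an illustrative matching tolerance in m/z units.
    """
    target = neutral_loss_mz(precursor_mz, charge)
    return any(abs(mz - target) <= tol for mz in fragment_mzs)


# Hypothetical doubly charged precursor at m/z 800.0.
expected = neutral_loss_mz(800.0, 2)
found = has_phospho_loss(800.0, 2, [400.2, 751.0, 900.5])
```

A neutral-loss match only flags a candidate phosphopeptide; assigning the modified residue still requires inspection of the sequence ions, as described in the text.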

Generation of phospho-ID2-T27 antibody. The anti-phospho-T27-ID2 antibody 
was generated by immunizing rabbits with a short synthetic peptide containing 
the phosphorylated T27 (CGISRSK-pT-PVDDPMS) (Yenzym Antibodies, LLC). 
A two-step purification process was applied. First, antiserum was cross-absorbed 
against the phospho-peptide matrix to purify antibodies that recognize the phospho- 
rylated peptide. Then, the anti-serum was purified against the un-phosphorylated 
peptide matrix to remove non-specific antibodies. 

Immunoblot, immunoprecipitation and in vitro binding assay. Cells were lysed 
in NP40 lysis buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 1 mM EDTA, 1% 
NP40, 1.5 mM Na3VO4, 50 mM sodium fluoride, 10 mM sodium pyrophosphate, 
10 mM β-glycerophosphate and EDTA-free protease inhibitor cocktail (Roche)) or 
RIPA buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 1 mM EDTA, 1% NP40, 0.5% 
sodium deoxycholate, 0.1% sodium dodecyl sulphate, 1.5 mM Na3VO4, 50 mM 
sodium fluoride, 10 mM sodium pyrophosphate, 10 mM β-glycerophosphate and 
EDTA-free protease inhibitor cocktail (Roche)). Lysates were cleared by 
centrifugation at 15,000 r.p.m. for 15 min at 4°C. For immunoprecipitation, cell lysates 
were incubated with primary antibody (hydroxyproline, Abcam, ab37067; VHL, 
BD, 556347; DYRK1A, Cell Signaling Technology, 2771; DYRK1B, Cell Signaling 
Technology, 5672) and protein G/A beads (Santa Cruz, sc-2003) or 
phospho-Tyrosine (P-Tyr-100) Sepharose beads (Cell Signaling Technology, 9419), HA 
affinity matrix (Roche, 11815016001), Flag M2 affinity gel (Sigma, F2426) at 4°C 
overnight. Beads were washed with lysis buffer four times and eluted in 2× SDS sample 
buffer. Protein samples were separated by SDS-PAGE and transferred to polyvinylidene 
difluoride (PVDF) or nitrocellulose (NC) membrane. Membranes were blocked in 
TBS with 5% non-fat milk and 0.1% Tween20, and probed with primary antibodies. 
Antibodies and working concentrations are: ID2 1:500 (C-20, sc-489), GFP 1:1,000 
(B-2, sc-9996), HIF2α/EPAS-1 1:250 (190b, sc-13596), c-MYC (9E10, sc-40), and 
Elongin B 1:1,000 (FL-118, sc-11447), obtained from Santa Cruz Biotechnology; 
phospho-Tyrosine 1:1,000 (P-Tyr-100, 9411), HA 1:1,000 (C29F4, 3724), VHL 
1:500 (2738), DYRK1A 1:1,000 (2771), DYRK1B 1:1,000 (5672) and RBX1 1:2,000 
(D3J5I, 11922), obtained from Cell Signaling Technology; VHL 1:500 (GeneTex, 
GTX101087); β-actin 1:8,000 (A5441), α-tubulin 1:8,000 (T5168), and Flag M2 
1:500 (F1804) obtained from Sigma; HIF1α 1:500 (H1alpha67, NB100-105) and 
Elongin C 1:1,000 (NB100-78353) obtained from Novus Biologicals; HA 1:1,000 
(3F10, 12158167001) obtained from Roche. Horseradish-peroxidase-conjugated 
secondary antibodies were purchased from Pierce and ECL solution (Amersham) 
was used for detection. 

For in vitro binding assays, HA-tagged RBX1, Elongin B, Elongin C and VHL 
were in vitro translated using TNT quick coupled transcription/translation system 
(Promega). Active VHL protein complex was purchased from EMD Millipore. 
Purified His-VHL protein was purchased from ProteinOne (Rockville, MD). 
GST, GST-ID2 and Flag-ID2 proteins were bacterially expressed and purified 
using glutathione sepharose beads (GE Healthcare Life Sciences). Active DYRK1B 
(Invitrogen) was used for in vitro phosphorylation of Flag-ID2 proteins. 
Biotinylated wild-type and modified (pT27 and T27W) ID2 peptides (amino acids 
14-34) were synthesized by LifeTein (Somerset, NJ). In vitro binding experiments 
between ID2 and VCB-Cul2 were performed using 500 ng of Flag-ID2 and 500 ng 
of VCB-Cul2 complex or 500 ng VHL protein in binding buffer (50 mM Tris-Cl, 
pH 7.5, 100 mM NaCl, 1 mM EDTA, 10 mM β-glycerophosphate, 10 mM sodium 
pyrophosphate, 50 mM sodium fluoride, 1.5 mM Na3VO4, 0.2% NP40, 10% 
glycerol, 0.1 mg ml−1 BSA and EDTA-free protease inhibitor cocktail (Roche)) at 4°C 
for 3 h. In vitro binding between ID2 peptides and purified proteins was performed 
using 2 µg of ID2 peptides and 200 ng of recombinant VCB-Cul2 complex or 200 ng 
recombinant VHL in binding buffer (50 mM Tris-Cl, pH 7.5, 100 mM NaCl, 1 mM 
EDTA, 10 mM β-glycerophosphate, 10 mM sodium pyrophosphate, 50 mM sodium 
fluoride, 1.5 mM Na3VO4, 0.4% NP40, 10% glycerol, 0.1 mg ml−1 BSA and 
EDTA-free protease inhibitor cocktail (Roche)) at 4°C for 3 h or overnight. Protein 
complexes were pulled down using glutathione sepharose beads (GE Healthcare Life 
Sciences) or streptavidin conjugated beads (Thermo Fisher Scientific) and analysed 
by immunoblot. 

In vitro and in vivo kinase assays. Cdk1, Cdk5, DYRK1A, DYRK1B, ERK, GSK3, 
PKA, CaMKII, Chk1, Chk2, RSK-1, RSK-2, aurora-A, aurora-B, PLK-1, PLK-2, 
and NEK2 were all purchased from Life Technology and ATM from EMD 
Millipore. The 18 protein kinases tested in the survey were selected because they 
are proline-directed S/T kinases (Cdk1, Cdk5, DYRK1A, DYRK1B, ERK) and/or 
because they were considered to be candidate kinases for Thr27, Ser14 or Ser5 
from kinase consensus prediction algorithms (NetPhosK1.0, http://www.cbs.dtu. 
dk/services/NetPhosK/; GPS Version 3.0 http://gps.biocuckoo.org/#) or visual 
inspection of the flanking regions and review of the literature for consensus kinase 
phosphorylation motifs. 1 µg of bacterially purified GST-ID substrates were 
incubated with 10-20 ng each of the recombinant active kinases. The reaction mixture 
included 10 µCi of [γ-32P]ATP (PerkinElmer Life Sciences) in 50 µl of kinase buffer 
(25 mM Tris-HCl, pH 7.5, 5 mM β-glycerophosphate, 2 mM dithiothreitol (DTT), 
0.1 mM Na3VO4, 10 mM MgCl2, and 0.2 mM ATP). Reactions were incubated at 
30°C for 30 min. Reactions were terminated by addition of Laemmli SDS sample 
buffer and boiling at 95°C for 5 min. Proteins were separated on SDS-PAGE gel 
and phosphorylation of proteins was visualized by autoradiography. Coomassie 
staining was used to document the amount of substrates included in the kinase 
reaction. In vitro phosphorylation of Flag-ID2 proteins by DYRK1B (Invitrogen) 
was performed using 500 ng of GST-DYRK1B and 200 ng of bacterially expressed 
purified Flag-ID2 protein. 

In vivo kinase assay in GSCs and glioma cells was performed using endogenous
or exogenously expressed DYRK1A and DYRK1B. Cell lysates were prepared
in lysis buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 1 mM EDTA, 1% NP40,
1.5 mM Na3VO4, 50 mM sodium fluoride, 10 mM sodium pyrophosphate, 10 mM
β-glycerophosphate and EDTA-free protease inhibitor cocktail (Roche)). DYRK1
kinases were immunoprecipitated using DYRK1A and DYRK1B antibodies (for
endogenous DYRK1 proteins) or GFP antibody (for exogenous GFP-DYRK1 proteins)
from 1 mg of cellular lysates at 4 °C. Immunoprecipitates were washed with
lysis buffer four times followed by two washes in kinase buffer as described above
and incubated with 200 ng of purified Flag-ID2 protein in kinase buffer for 30 min
at 30 °C. Kinase reactions were separated by SDS-PAGE and analysed by western
blot using p-T27-ID2 antibody.

Protein half-life and stoichiometry. HIF2α half-life was quantified using ImageJ
processing software (NIH). Densitometry values were analysed by Prism 6.0 using
the linear regression function. Stoichiometric quantification of ID2 and VHL in
U87 cells was obtained using recombinant Flag-ID2 and His-tagged VHL as
references. The chemiluminescent signal of serial dilutions of the recombinant
proteins was quantified using ImageJ and plotted to generate a linear standard curve
against which the densitometric signal generated by serial dilutions of cellular
lysates (1 x 10° U87 cells) was calculated. Triplicate values ± s.e.m. were used to
estimate the ID2:VHL ratio per cell. The stoichiometry of pT27-ID2 phosphorylation
was determined as described. Briefly, SK-N-SH cells were plated at a density
of 1 x 10° in 100 mm dishes. Forty-eight hours later, 1.5 mg of cellular lysates from
cells untreated or treated with CoCl2 during the previous 24 h were prepared in
RIPA buffer and immunoprecipitated using 4 μg of pT27-ID2 antibody or rabbit
IgG overnight at 4 °C. Immune complexes were collected with TrueBlot anti-rabbit
IgG beads (Rockland), washed 5 times in lysis buffer, and eluted in SDS sample
buffer. Serial dilutions of cellular lysates, IgG and pT27-ID2 immunoprecipitates
were loaded as duplicate series for SDS-PAGE and western blot analysis using ID2
or p-T27-ID2 antibodies. Densitometric quantification of the chemiluminescent
signals was used to determine (1) the efficiency of the immunoprecipitation using
the antibody against p-ID2-T27 and (2) the ratio between the efficiency of the
immunoprecipitation evaluated by western blot for p-T27-ID2 and total ID2 antibodies.
This represents the percent of phosphorylated Thr27 of ID2 present in the cell
preparation.
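
The half-life estimate described above can be illustrated with a short sketch. It assumes first-order decay, so that log2 of the percent protein remaining falls linearly with time and the half-life is the negative reciprocal of the fitted slope. The function names and the example values are hypothetical, and the fit is an ordinary least-squares line written out by hand, not Prism's own routine.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def half_life(times, percent_remaining):
    """Half-life in the units of `times`, assuming first-order decay.

    percent_remaining holds densitometry values expressed as a percent of
    the untreated (time 0) signal; all values must be positive.
    """
    log2_percent = [math.log2(p) for p in percent_remaining]
    slope, _ = linear_fit(times, log2_percent)
    # One log2 unit lost corresponds to one halving of the protein.
    return -1.0 / slope


# Hypothetical densitometry series (hours, percent of untreated signal).
example_half_life = half_life([0, 1, 2, 3], [100, 50, 25, 12.5])
```

Fitting on the log2 scale keeps the regression linear, which matches the use of a linear regression function on the densitometry values.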

Identification of ID2 complexes by mass spectrometry. Cellular ID2 complexes
were purified from the cell line NCI-H1299 stably engineered to express Flag-HA-ID2.
Cellular lysates were prepared in 50 mM Tris-HCl, 250 mM NaCl, 0.2% NP40,
1 mM EDTA, 10% glycerol, protease and phosphatase inhibitors. Flag-HA-ID2
immunoprecipitates were recovered first with anti-Flag antibody-conjugated
M2 agarose (Sigma) and washed with lysis buffer containing 300 mM NaCl and
0.3% NP40. Bound polypeptides were eluted with Flag peptide and further affinity
purified by anti-HA antibody-conjugated agarose (Roche). The eluates from
the HA beads were analysed directly on long gradient reverse phase LC-MS/MS.
A specificity score of proteins interacting with ID2 was computed for each
polypeptide by comparing the number of peptides identified from mass spectrometry
analysis to those reported in the CRAPome database, which includes a list
of potential contaminants from affinity purification-mass spectrometry experiments
(http://www.crapome.org). The specificity score is computed as
(#peptide × #xcorr)/(AveSC × MaxSC × # of Expt.), where #peptide is the identified
peptide count; #xcorr, the cross-correlation score for all candidate peptides
queried from the database; AveSC, the averaged spectral counts from CRAPome;
MaxSC, the maximal spectral counts from CRAPome; and # of Expt., the total number
of experiments found in CRAPome.
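
As a worked illustration of the specificity score defined above, the following sketch evaluates the formula for one candidate interactor. The function and argument names are hypothetical, the example numbers are invented for illustration, and the handling of a zero denominator (a protein with no CRAPome record) is an assumption, since the text does not say how such cases were treated.

```python
def specificity_score(n_peptides, xcorr, ave_sc, max_sc, n_expt):
    """(#peptide * #xcorr) / (AveSC * MaxSC * # of Expt.).

    n_peptides: identified peptide count for the candidate protein
    xcorr:      cross-correlation score for its candidate peptides
    ave_sc:     averaged spectral counts reported in CRAPome
    max_sc:     maximal spectral counts reported in CRAPome
    n_expt:     number of CRAPome experiments in which it was found
    """
    denominator = ave_sc * max_sc * n_expt
    if denominator == 0:
        # Assumption: the score is left undefined rather than set to infinity.
        raise ValueError("no CRAPome background available for this protein")
    return (n_peptides * xcorr) / denominator


# Invented values, for illustration only.
score = specificity_score(n_peptides=12, xcorr=3.5, ave_sc=2.0, max_sc=7, n_expt=3)
```

A protein that is abundant in the bait purification but rare in the contaminant repository receives a high score, whereas a frequent background protein is penalized by all three denominator terms.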

Ubiquitylation assay. U87 cells were transfected with pcDNA3-HA-HIFα
(HIF1α or HIF2α), pcDNA3-Flag-ID2 (WT or T27A), pEGFP-DYRK1B and
pcDNA3-Myc-Ubiquitin. 36 h after transfection, cells were treated with 20 μM
MG132 (EMD Millipore) for 6 h. After washing with ice-cold PBS twice, cells
were lysed in 100 μl of 50 mM Tris-HCl pH 8.0, 150 mM NaCl (TBS) containing
2% SDS and boiled at 100 °C for 10 min. Lysates were diluted with 900 μl of TBS
containing 1% NP40. Immunoprecipitation was performed using 1 mg of cellular
lysates. Ubiquitylated proteins were immunoprecipitated using anti-Myc antibody
and analysed by western blot using HA antibody.

Docking of ID2 peptide to the VCB complex. A previously described, highly
accurate flexible peptide docking method implemented in ICM software (Molsoft
LLC, La Jolla, CA) was used to dock ID2 peptides to VCB or components thereof.
A series of overlapping peptides of varying lengths were docked to the complex
of VHL and Elongin C (EloC), or VHL or EloC alone, from the recent crystallographic
structure of the VHL-CRL ligase. Briefly, an all-atom model of the
peptide was docked into grid potentials derived from the X-ray structure using a
stochastic global optimization in internal coordinates with pseudo-Brownian and
collective 'probability-biased' random moves as implemented in the ICM program.
Five types of potentials for the peptide-receptor interaction energy (hydrogen
van der Waals, non-hydrogen van der Waals, hydrogen bonding, hydrophobicity
and electrostatics) were precomputed on a rectilinear grid with 0.5 Å spacing that
fills a 34 Å × 34 Å × 25 Å box containing the VHL-EloC (V-C) complex, to which
the peptide was docked by searching its full conformational space within the space
of the grid potentials. The preferred docking conformation was identified as the
lowest energy conformation in the search. The preferred peptide was identified by
its maximal contact surface area with the respective receptor.

Ab initio folding and analysis of the peptides was performed as previously
described. Ab initio folding of the ID2 peptide and its phospho-T27 mutant
showed that both strongly prefer an α-helical conformation when free (unbound) in
solution, with the phospho-T27 mutant having a calculated free energy almost
50 kcal-equivalent units lower than the unmodified peptide.

RT-PCR. Total RNA was prepared with Trizol reagent (Invitrogen)
and cDNA was synthesized using SuperScript II Reverse Transcriptase
(Invitrogen) as described. Semi-quantitative RT-PCR was performed
using AccuPrime Taq DNA polymerase (Invitrogen) and the following primers:
HIF2A Fw 5′-GTGCTCCCACGGCCTGTA-3′ and Rv
5′-TTGTCACACCTATGGCATATCACA-3′; GAPDH Fw
5′-AGAAGGCTGGGGCTCATTTG-3′ and Rv 5′-AGGGGCCATCCACAGTCTTC-3′.
Quantitative RT-PCR was performed with a Roche480 thermal cycler, using SYBR
Green PCR Master Mix from Applied Biosystems.

Primers used in qRT-PCR are: SOX2 Fw 5′-TTGCTGCCTCTTTAAGACTAGGA-3′
and Rv 5′-CTGGGGCTCAAACTTCTCTC-3′; NANOG Fw
5′-ATGCCTCACACGGAGACTGT-3′ and Rv 5′-AAGTGGGTTGTTTGCCTTTG-3′;
POU5F1 Fw 5′-GTGGAGGAAGCTGACAACAA-3′ and
Rv 5′-ATTCTCCAGGTTGCCTCTCA-3′; FLT1 Fw 5′-AGCCCATAAATGGTCTTTGC-3′
and Rv 5′-GTGGTTTGCTTGAGCTGTGT-3′; PIK3CA
Fw 5′-TGCAAAGAATCAGAACAATGCC-3′ and Rv 5′-CACGGAGGCATTCTAAAGTCA-3′;
BMI1 Fw 5′-AATCCCCACCTGATGTGTGT-3′ and Rv
5′-GCTGGTCTCCAGGTAACGAA-3′; GAPDH Fw 5′-GAAGGTGAAGGTCGGAGTCAAC-3′
and Rv 5′-CAGAGTTAAAAGCAGCCCTGGT-3′;
18S Fw 5′-CGCCGCTAGAGGTGAAATTC-3′ and Rv 5′-CTTTCGCTCTGGTCCGTCTT-3′.
The relative amount of specific mRNA was normalized
to 18S or GAPDH. Results are presented as the mean ± s.d. of three independent
experiments each performed in triplicate (n = 9). Statistical significance was
determined by Student's t-test (two-tailed) using GraphPad Prism 6.0 software.

Subcutaneous and intracranial xenograft glioma models. Mice were housed in
a pathogen-free animal facility. All animal studies were approved by the IACUC at
Columbia University (numbers AAAE9252; AAAE9956). Mice were 4-6-week-old
male athymic nude (Nu/Nu, Charles River Laboratories). No statistical method
was used to pre-determine sample size. No method of randomization was used
to allocate animals to experimental groups. Mice in the same cage were generally
part of the same treatment group. The investigators were not blinded during outcome
assessment. In none of the experiments did tumours exceed the maximum volume
allowed according to our IACUC protocol, specifically 20 mm in the maximum
diameter. 2 x 10° U87 cells stably expressing a doxycycline-inducible lentiviral
vector coding for DYRK1B or the empty vector were injected subcutaneously in
the right flank in a 100 μl volume of saline solution (7 mice per group). Mice
carrying 150-220 mm³ subcutaneous tumours (21 days after injection) generated
by cells transduced with DYRK1B were treated with vehicle or doxycycline by
oral gavage (Vibramycin, Pfizer Labs; 8 mg ml⁻¹, 0.2 ml per day); mice carrying
tumours generated by cells transduced with the empty vector were also fed with
doxycycline. Tumour diameters were measured daily with a caliper and tumour
volumes estimated using the formula: V (mm³) = width² × length/2. Mice were
euthanized after 5 days of doxycycline treatment. Tumours were dissected and fixed
in formalin for immunohistochemical analysis. Data are means ± s.d. of 7 mice in
each group. Statistical significance was determined by ANCOVA using the GraphPad
Prism 6.0 software package (GraphPad).
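
The caliper-based volume estimate above is simple arithmetic, sketched here for clarity. The function name and the example measurements are hypothetical; width is taken to be the shorter of the two measured diameters, which is the usual convention for this formula and is an assumption here.

```python
def tumour_volume_mm3(width_mm, length_mm):
    """Estimate tumour volume as width^2 x length / 2, in cubic millimetres."""
    return width_mm ** 2 * length_mm / 2.0


# Hypothetical caliper readings: 6 mm wide, 10 mm long.
volume = tumour_volume_mm3(6, 10)  # 180.0 mm^3
```

With these example readings the estimate falls inside the 150-220 mm³ window used as the starting size for doxycycline treatment.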

Orthotopic implantation of glioma cells was performed as described previously
using 5 x 10* U87 cells transduced with pLOC-vector, pLOC-DYRK1B (WT) or
pLOC-DYRK1B-K140R mutant in 2 μl of phosphate buffer. In brief, 5 days after
lentiviral infection, cells were injected 2 mm lateral and 0.5 mm anterior to the
bregma, 2.5 mm below the skull of 4-6-week-old athymic nude (Nu/Nu, Charles
River Laboratories) mice. Mice were monitored daily for abnormal ill effects
according to AAALAS guidelines and euthanized when neurological symptoms
were observed. Tumours were dissected and fixed in formalin for immunohistochemical
analysis and immunofluorescence using V5 antibody (Life Technologies,
46-0705) to identify exogenous DYRK1B and an antibody against human vimentin
(Sigma, V6630) to identify human glioma cells. A Kaplan-Meier survival curve
was generated using the GraphPad Prism 6.0 software package (GraphPad). Points
on the curves indicate glioma-related deaths (n = 7 animals for each group; P was
determined by log rank analysis). We did not observe non-glioma-related deaths.
Mice injected with U87 cells transduced with pLOC-DYRK1B(WT) that did not
show neurological signs on day 70 were euthanized for histological evaluation
and are shown as tumour-free mice in Fig. 5g. Intracranial injection of
H-Ras-V12-IRES-Cre-ER-shp53 lentivirus was performed in 4-week-old
Id1Flox/Flox, Id2Flox/Flox, Id3−/− mice (C57Bl6/SV129). Briefly, 1.3 μl of purified
lentiviral particles in PBS were injected 1.45 mm lateral and 1.6 mm anterior to
the bregma and 2.3 mm below the skull using a stereotaxic frame. Tamoxifen was
administered for 5 days at 9 mg per 40 g of mouse weight by oral gavage starting
30 days after surgery. Mice were killed 2 days later and brains dissected and fixed
for histological analysis.

Immunohistochemistry and immunofluorescence. Tissue preparation and
immunohistochemistry on tumour xenografts were performed as previously
described. Antibodies used in immunostaining are: HIF2α, mouse
monoclonal, 1:200 (Novus Biological, NB100-132); Olig2, rabbit polyclonal,
1:200 (IBL International, JP18953); human Vimentin, 1:50 (Sigma, V6630);
Bromodeoxyuridine, mouse monoclonal, 1:500 (Roche, 11170376001); V5, 1:500
(Life Technologies, 46-0705). Sections were permeabilized in 0.2% Triton X-100
for 10 min and blocked with 1% BSA-5% goat serum in PBS for 1 h. Primary antibodies
were incubated at 4 °C overnight. Secondary antibodies, biotinylated (Vector
Laboratories) or conjugated with Alexa594 (1:500, Molecular Probes), were used.
Slides were counterstained with haematoxylin for immunohistochemistry and
DNA was counterstained with DAPI (Sigma) for immunofluorescence. Images
were acquired using an Olympus IX70 microscope equipped with a digital camera
and processed using Adobe Photoshop CS6 software. BrdU-positive cells were
quantified by scoring the number of positive cells in five 4 x 10-* mm² images
from 5 different mice from each group. Blinding was applied during histological
analysis. Data are presented as means of five different mice ± standard deviation
(s.d.) (two-tailed Student's t-test, unequal variance).

Computational analysis of dependency of the HIF2α regulon on ID2 activity.
To infer whether ID2 modulates the interactions between HIF2α and its transcriptional
targets, we used a modified version of the MINDy algorithm, called CINDy. CINDy
uses an adaptive partitioning method to accurately estimate the full conditional mutual
information between a transcription factor and a target gene given the expression
or activity of a signalling protein. Briefly, for every pair of transcription factor
and target gene of interest, it estimates the mutual information, that is, how much
information can be inferred about the target gene when the expression of the
transcription factor is known, conditioned on the expression/activity of the signalling
protein. It estimates this conditional mutual information by estimating the
multi-dimensional probability densities after partitioning the sample distribution using
the adaptive partitioning method. We applied the CINDy algorithm to gene expression
data for 548 samples obtained from The Cancer Genome Atlas (TCGA). Since the
activity level, and not the gene expression, of ID2 is the determinant of its
modulatory function, that is, the extent to which it modulates the transcriptional network
of HIF2α, we used an algorithm called Virtual Inference of Protein-activity by
Enriched Regulon analysis (VIPER) to infer the activity of ID2 protein from its
gene expression profile. The VIPER method allows the computational inference of
protein activity, on an individual sample basis, from gene expression profile data.
It uses the expression of genes that are most directly regulated by a given protein,
such as the targets of a transcription factor (TF), as an accurate reporter of its
activity. We defined the targets of ID2 by running the ARACNe algorithm on 548
gene expression profiles and used the inferred 106 targets to determine its activity
(Supplementary Table 3).
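
To make the quantity estimated in this analysis concrete, the sketch below computes a plug-in estimate of the conditional mutual information I(TF; target | modulator) from binned expression values. It is not the CINDy implementation: fixed equal-frequency bins stand in for adaptive partitioning, the function names are hypothetical, and the toy data are invented.

```python
import math
from collections import Counter


def discretize(values, n_bins):
    """Equal-frequency binning; ties are broken by input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = min(rank * n_bins // len(values), n_bins - 1)
    return bins


def conditional_mutual_information(tf, target, modulator, n_bins=3):
    """Plug-in estimate of I(tf; target | modulator) in bits."""
    xb = discretize(tf, n_bins)
    yb = discretize(target, n_bins)
    zb = discretize(modulator, n_bins)
    n = len(xb)
    c_xyz = Counter(zip(xb, yb, zb))
    c_xz = Counter(zip(xb, zb))
    c_yz = Counter(zip(yb, zb))
    c_z = Counter(zb)
    cmi = 0.0
    for (x, y, z), n_xyz in c_xyz.items():
        # p(x,y,z) * log2[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ], written in counts.
        ratio = n_xyz * c_z[z] / (c_xz[(x, z)] * c_yz[(y, z)])
        cmi += (n_xyz / n) * math.log2(ratio)
    return cmi


# Toy data: the target copies the TF exactly within every modulator bin.
tf_expr = list(range(9))
modulator_activity = [0, 1, 2, 0, 1, 2, 0, 1, 2]
toy_cmi = conditional_mutual_information(tf_expr, tf_expr, modulator_activity)
```

Fixed bins are the simplest possible choice and become unreliable when samples are scarce or unevenly distributed, which is the situation adaptive partitioning is designed to handle.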

We applied CINDy to 277 targets of HIF2α represented in Ingenuity pathway
analysis (IPA) and for which gene expression data were available (Supplementary
Table 4). Of these 277 targets, 77 are significantly modulated by ID2 activity
(P value < 0.05). Among the set of target genes whose expression was significantly
positively correlated (P value < 0.05) with the expression of HIF2α irrespective of
the activity of ID2, that is, whose correlation was significant for samples with both high
and low activity of ID2, the average expression of target genes for a given expression
of HIF2α was higher when the activity of ID2 was high. The same set of target genes
was more correlated in high ID2 activity samples than any set of random
genes of the same size (Fig. 5a), whereas it was not in ID2 low activity samples
(Fig. 5b). We selected the 25% of all samples with the highest/lowest ID2 activity to
calculate the correlation between HIF2α and its targets.

To determine whether regulation of ID2 by hypoxia might impact the correlation
between high ID2 activity and HIF2α shown in Fig. 5a, b, we compared
the effects of ID2 activity versus ID2 expression on the transcriptional connection
between HIF2α and its targets. We selected 25% of all patients (n = 548)
in TCGA with high ID2 activity and 25% of patients with low ID2 activity and
tested the enrichment of significantly positively correlated targets of HIF2α in
each of the groups. This resulted in significant enrichment (P value < 0.001)
in high ID2 activity samples but no significant enrichment (P value = 0.093)
in low ID2 activity samples. Moreover, the difference in the enrichment score
(ΔES) between these two groups was statistically significant (P value < 0.05). This
significance was calculated by randomly selecting the same number of genes as
the positively correlated targets of HIF2α and calculating the ΔES for these
randomly selected genes, giving ΔESrand. We repeated this step 1,000 times to
obtain 1,000 ΔESrand values that were used to build the null distribution (Extended Data
Fig. 9b). We used the null distribution to estimate the P value, calculated as (number
of ΔES > ΔESrand)/1,000. Enrichment was observed only when ID2 activity was
high but not when ID2 activity was low, thus suggesting that ID2 activity
directionally impacts the regulation of targets of HIF2α by HIF2α. Consistently, the
significant ΔES using ID2 activity suggests that ID2 activity is a determinant of
the correlation between HIF2α and its targets.
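
The random-gene-set procedure described above can be sketched as follows. This is a minimal illustration and not the authors' code: `score_fn` is a hypothetical callable that returns the ΔES for a given gene set, and the P value follows the conventional permutation estimate (the fraction of random ΔES values at least as large as the observed one), which is an assumption about how the counts were tallied.

```python
import random


def delta_es_null(score_fn, gene_pool, n_targets, n_perm=1000, seed=0):
    """Build a null distribution of delta-ES values from random gene sets.

    score_fn:  hypothetical callable mapping a list of genes to a delta-ES
    gene_pool: list of all genes available for random selection
    n_targets: size of the real (positively correlated) target set
    """
    rng = random.Random(seed)
    return [score_fn(rng.sample(gene_pool, n_targets)) for _ in range(n_perm)]


def empirical_p_value(observed, null_values):
    """Fraction of null values at least as large as the observed delta-ES."""
    exceed = sum(1 for value in null_values if value >= observed)
    return exceed / len(null_values)
```

With 1,000 random sets the smallest non-zero P value this estimate can return is 0.001, which sets the resolution of the test.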

Conversely, when we performed a similar analysis using ID2 expression instead
of ID2 activity, we found significant enrichment of positively correlated targets of
HIF2α both in samples with high expression (P value = 0.025) and low expression
of ID2 (P value = 0.048). Given the significant enrichment in both groups, we did
not observe any significant difference in the enrichment score between the two groups
(P value of ΔES = 0.338). Thus, while the determination of the ID2 activity and
its effects upon the HIF2α-targets connection by VIPER and CINDy allowed us
to determine the unidirectional positive link between high ID2 activity and HIF2α
transcription, a similar analysis performed using ID2 expression contemplates the
dual connection between ID2 and HIF2α.

Kaplan-Meier analysis for DYRK1A and DYRK1B in human GBM. To test
whether expression of DYRK1A and DYRK1B is a predictor of prognosis, we divided
the patients into two cohorts based on their relative expression compared to the
mean expression of all patients in GBM. The first cohort contained the patients with
high expression of both DYRK1A and DYRK1B (n = 101) and the other cohort
contained patients with low expression (n = 128). We used the average expression of
DYRK1A and of DYRK1B, each of which individually divides the patient cohort roughly
in half. However, when we apply the condition that patients should display higher
or lower than average expression of both these genes, we select approximately 19%
for high expression and 24% for low expression. Selection of these patients was
entirely dependent on the overall expression of these genes in the entire cohort
rather than a predefined cutoff. Kaplan-Meier survival analysis showed a significant
survival benefit for the patients having high expression of both DYRK1A
and DYRK1B (P value = 0.004) compared to the patients with low expression.


© 2016 Macmillan Publishers Limited. All rights reserved 


When a similar analysis was performed using the expression of DYRK1A or
DYRK1B alone, the prediction was either non-significant (DYRK1A) or less
significant (DYRK1B, P value = 0.008) when compared to the predictions using the
expression of both genes.
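
The cohort definition used for this survival analysis can be sketched in a few lines. The function name and the example values are hypothetical; patients whose two genes fall on opposite sides of their respective means are left out of both cohorts, as the text implies.

```python
from statistics import mean


def split_cohorts(expr_a, expr_b):
    """Return (high, low) lists of patient indices.

    expr_a, expr_b: per-patient expression of the two genes, in the same order.
    high: both genes above their cohort-wide mean
    low:  both genes below their cohort-wide mean
    """
    mean_a, mean_b = mean(expr_a), mean(expr_b)
    pairs = list(enumerate(zip(expr_a, expr_b)))
    high = [i for i, (a, b) in pairs if a > mean_a and b > mean_b]
    low = [i for i, (a, b) in pairs if a < mean_a and b < mean_b]
    return high, low


# Invented expression values for four patients.
high_cohort, low_cohort = split_cohorts([1, 5, 2, 6], [2, 7, 8, 1])
```

Because the thresholds are the cohort means, the split depends only on the data and needs no predefined cutoff, and the two cohorts together cover well under the full patient set.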

Statistics. Results in graphs are expressed as means ± s.d. or means ± s.e.m., as
indicated in figure legends, for the indicated number of observations. Statistical
significance was determined by the Student's t-test (two-tailed, unequal
variance). P value < 0.05 is considered significant and is indicated in figure
legends.

Extended Data Figure 1 | ID2 is phosphorylated on Ser5, Ser14 and
Thr27. Chromatographic results of mass spectrometry analysis of ID2
protein immunoprecipitated from IMR32 human neuroblastoma cells.
a, The peptide identified as ID2 A3-R8 shows phosphorylation of Ser5.
b, The peptide identified as ID2 K12-R24 shows phosphorylation of Ser14.
c, The peptide identified as ID2 S25-L36 shows phosphorylation of Thr27.

Extended Data Figure 2 | T27A missense mutation in ID2 in human
cancer cells and Thr27 phosphorylation of ID2 by DYRK1 kinases.
a, Sequence analysis of genomic DNA from the neuroblastoma cell line
IMR32 shows the wild-type sequence (left). Sequencing of DNA from the
colon cancer cell line HRT-18 shows a heterozygous mutation resulting
in the change of codon 27 from ACC (Thr) to GCC (Ala) (right). b, Both
wild-type and mutant ID2(T27A) are expressed in HRT-18 colon cancer
cells. Sequence analysis of representative clones (out of 20 clones) derived
from HRT-18 cDNA demonstrates expression of wild-type (left panel)
and mutant (right panel) alleles. c, Amino acid sequence flanking Thr27
of ID2 (marked in red), including the DYRK1 consensus motif (bold),
is evolutionarily conserved. d, In vitro kinase assay using bacterially
expressed GST-ID proteins and recombinant DYRK1A. e, U87 cells
transfected with Flag-ID2, Flag-ID2(T27A) or the empty vector were
immunoprecipitated with Flag antibody. Co-precipitated proteins were
analysed by western blot using DYRK1A, DYRK1B and Flag antibodies.
β-actin was used as control for loading. WCL, whole cellular lysate.
f, U87 cells stably transfected with Flag-ID2 were treated with harmine (10 μM)
or vehicle for 24 h and analysed by western blot using the indicated
antibodies.

Extended Data Figure 3 | DYRK1 kinase activity and Thr27
phosphorylation of ID2 are inhibited by hypoxia. a, U87 glioma cells
were treated with 100 μM CoCl2 for the indicated times. Cellular lysates
were analysed by western blot using the indicated antibodies. b, SK-N-SH
cells were treated with 300 μM CoCl2 for the indicated times and assayed
by western blot using the indicated antibodies. c, Stoichiometric evaluation
of pThr-27-ID2 in SK-N-SH cells untreated or treated with CoCl2 for 24 h.
Cellular lysates prepared in denaturing buffer were immunoprecipitated
using pT27-ID2 antibody or normal rabbit IgG. Aliquots of whole cellular
lysates (WCL, μg) and immunoprecipitates were assayed by western blot
using pT27-ID2 and non-phosphorylated ID2 antibodies (upper panels).
The efficiency of immunoprecipitation with anti-pT27-ID2 antibody from
untreated cells was determined to calculate the percent of the pT27-ID2
in the absence and in the presence of CoCl2 (lower panel). d, 293T cells
expressing GFP-DYRK1 proteins untreated or treated with 100 μM CoCl2
for 12 h were used as a source of active kinase. The kinase activity of the
anti-GFP-DYRK1 immunoprecipitates was tested in vitro using bacterially
expressed and purified Flag-ID2 as substrate. Kinase reactions were
evaluated by western blot using p-T27-ID2 antibodies (top). Analysis of
kinase reactions by Flag immunoblot shows similar amounts of ID2 protein
in each kinase reaction (middle). Immunocomplexes were analysed by
western blot using GFP antibody (bottom). e, Lysates from U251 cells
expressing GFP-DYRK1 proteins untreated or treated with 100 μM CoCl2
for 6 h were immunoprecipitated using GFP antibodies. Western blot was
performed using anti-p-Tyrosine (p-Tyr) or GFP antibodies. Analysis
of WCL shows similar expression levels of DYRK1 proteins. α-tubulin
was used as control for loading. f, Lysates from 293T cells expressing
GFP-DYRK1A untreated or treated with 100 μM CoCl2 for 12 h were
immunoprecipitated with anti-p-Tyr antibodies and analysed by western
blot using antibodies against GFP. α-tubulin was used as control for
loading. g, Lysates from 293T cells expressing GFP-DYRK1B untreated
or treated with 100 μM CoCl2 for 12 h were immunoprecipitated with
anti-p-Tyr antibodies and analysed by western blot using antibodies
against GFP. α-tubulin was used as control for loading. h, U87 cells transfected
with GFP-DYRK1A, GFP-DYRK1B or GFP and Flag-PHD1, Flag-PHD2,
or Flag-PHD3 were immunoprecipitated using anti-hydroxyproline
antibody. Western blot was performed using GFP antibody (upper panels).
HC, IgG heavy chain. Lower panels, WCL.

Extended Data Figure 4 | The DYRK1-ID2 Thr27 pathway controls
GSCs and HIF2α. a, GSC line 48 cells were transduced with lentiviruses
expressing ID2(WT), ID2(T27A), or the empty vector. b, Cells were
analysed by in vitro LDA. Representative regression plot used to calculate
gliomasphere frequency in panel c. c, The frequency of cells capable
of forming gliomaspheres by in vitro LDA. Data in the histograms
represent means of 3 biological replicates ± s.d.; **P = 0.00163. d, The
microphotographs show representative gliomasphere cultures of cells
treated as in a. e, HIF2α mRNAs from cells treated as in a were analysed
by semi-quantitative RT-PCR. f, U87 cells stably expressing Flag-ID2
or Flag-ID2(T27A) were analysed by western blot using the indicated
antibodies. Arrow points to specific band. Arrowhead indicates Flag-ID2.
Asterisk indicates endogenous ID2. g, GSC line 34 cells were transduced
with lentiviruses expressing DYRK1B-V5 or empty vector. Cells were
analysed by western blot using the indicated antibodies. Arrow points to
specific band. Asterisk indicates a non-specific band. h, qRT-PCR from
cells treated as in g. Data in the histograms represent means ± s.d. (n = 9,
triplicate experiments each performed in triplicate; ***P = 8.44524 × 10⁻⁷
for TGFA). i, GSC line 31 was transduced with lentiviruses expressing
DYRK1B-V5 or empty vector. Expression of HIF2α, DYRK1B-V5 and
α-tubulin was analysed by western blot. j, mRNAs from the experiment shown
in Fig. 3a-c were analysed by semi-quantitative RT-PCR for HIF2α.
k, GSC line 31 cells were transduced with lentiviruses expressing DYRK1B
and ID2, ID2(T27A), or the empty vector. Cells were analysed by LDA.
Representative regression plot used to calculate gliomasphere frequency
in Fig. 3b. l, GSC line 31 cells were transduced with lentiviruses
expressing DYRK1B or the empty vector in the absence or in the presence
of undegradable HIF2α (HIF2α-TM). Cells were analysed by in vitro
LDA. Representative regression plot used to calculate the frequency of
gliomaspheres in cultures from three independent infections (Vect plus
Vect = 13.55%; DYRK1B-Vect = 4.36%; DYRK1B-HIF2α-TM = 9.73%).

Extended Data Figure 5 | The DYRK1-ID2-Thr27 pathway modulates
HIFα stability by regulating the interaction between ID2 and VHL.
a, In vivo ubiquitylation of HIF1α protein. U87 cells transfected with
the expression plasmids HIF1α and MYC-ubiquitin were co-transfected
with Flag-ID2, Flag-ID2(T27A), or the empty vector in the presence or
in the absence of GFP-DYRK1B. After treatment with MG132 (20 μM)
for 6 h, lysates were prepared in denaturing buffer and identical aliquots
were immunoprecipitated with antibodies directed against MYC. An
anti-HA antibody was used to detect HIF1α ubiquitin conjugates (left).
Cellular lysates (WCL) were analysed by western blot using the indicated
antibodies (right). b, U87 cells were co-transfected with plasmids
expressing HA-HIF2α and GFP-DYRK1B or GFP-vector. Cells were
treated with 50 μg ml⁻¹ of CHX for the indicated times and analysed by
western blot. c, Quantification of HIF2α protein from the experiment
in panel b as the log2 of the percent of HIF2α relative to untreated cells.
d, IMR32 cells were co-transfected with ID2 and Flag-VHL or Flag-HIF1α
expression vectors. Immunoprecipitation was performed using Flag
antibody and immunocomplexes and whole cellular lysates (WCL) were
analysed by western blot using the indicated antibodies. e, IMR32 cells
transfected with Flag-VHL expression vector were used for IgG or ID2
antibody immunoprecipitation. Immunocomplexes and WCL were
analysed by western blot. Arrow points to the specific Flag-VHL band;
asterisk indicates IgG light chain. f, Flag immunoprecipitation of binding
reactions of in vitro translated Flag-ID and HA-Elongin C proteins.
Immunocomplexes were analysed by western blot for HA and Flag.
g, Flag-ID proteins and HA-VHL were translated and incubated in vitro.
Flag immunocomplexes were analysed by western blot for HA and Flag.
h, In vitro streptavidin pulldown assay of biotinylated ID2 peptides (amino
acids 14-34 (WT), pT27, and T27W) and in vitro translated HA-VHL.
Bound polypeptides were detected by western blot.

Extended Data Figure 6 | Molecular docking of an ID2 (15-31)
peptide on the VHL-Elongin C complex. a, Ribbon representation of
the backbone of the VHL-Elongin C complex and the predicted binding
conformation of the ID2 peptide. VHL (red ribbon), Elongin C (blue
ribbon) and the docked ID2 peptide (purple ribbon). Cul2 contact
residues are colored yellow in both VHL and Elongin C. Arrow
indicates the ID2 peptide. b, Docking result for the phospho-Thr-27-ID2
peptide shown from the same perspective as in panel a. c, The view and
complex in a is rotated 90 degrees around an axis parallel to the page so
that the perspective is from the arrow shown in panel a. d, Electrostatic
molecular surface representation of the VHL-Elongin C complex with
the docked ID2 peptide. The perspective is the same as in panel c. The
T27 side chain is shown as space-filling spheres and is indicated by the red
arrow. The N-terminus and C-terminus of the ID2 peptide are indicated
by purple arrows.
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Extended Data Figure 7 | DYRK1-mediated phosphorylation of ID2 
prevents dissociation of the VCB-Cul2 complex. a, In vivo binding assay 
using lysates from U87 cells co-transfected with HA-VHL and Flag-ID2 
or Flag-ID2(T27E) expression vectors. Flag immunocomplexes were 
analysed by western blot using HA and Flag antibodies. Whole cell lysates, 
WCL, were analysed by western blot using the indicated antibodies. 
Binding of Flag-ID2 and Flag-ID2(T27E) to the bHLH protein E47 is 
shown as control for ID2 binding. b, U87 cells were transfected with Flag- 
ID2, Flag-ID2(T27A) or Flag-ID2(T27E) plasmids. Cellular lysates were 
analysed by western blot using the indicated antibodies. c, In vitro binding 
between purified Flag-ID2 and His-VHL following in vitro kinase reaction 
using recombinant DYRK1B and Flag-ID2. d, Analysis of the HA-Elongin 
C immunocomplexes in U87 cells transfected with HA-Elongin C in the 
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absence or presence of Flag-ID2(T27A). Anti-HA immunoprecipitation 
reactions and WCL were analysed by western blot using antibodies against 
Cul2, HA (Elongin C), and Flag (ID2). e, Analysis of the Flag-SOCS2 
immunocomplexes in U87 cells transfected with ID2, ID2(T27A) or the 
empty vector. Flag immunoprecipitation reactions and WCL were analysed 
by western blot using antibodies against Cul5, ID2, and Flag (SOCS2). 

f, Stoichiometric analysis of ID2 and VHL in cellular lysates. Decreasing 
amount of WCL from 1 x 10° U87 cells and purified proteins were assayed 
by western blot (left). Regression plots of densitometry analysis were used 
to determine ID2 and VHL protein concentration and the ID2:VHL ratio 
(right). g, Immunoprecipitation of endogenous VHL in U87 cells in the 
presence and in the absence of CoCl2. Cul2 and VHL were 
analysed by western blot. Vinculin is shown as loading control. 
Extended Data Figure 8 | DYRK1 kinase inhibits proliferation of 
human glioma. a, Malignant glioma were induced in Id1Flox/Flox- 
Id2Flox/Flox-Id3−/− mice via injection of lentivirus expressing H-RAS-V12- 
IRES-CRE-ER linked to U6-shp53 cassette into the dentate gyrus as 
described’. Mice were treated for 5 days with tamoxifen or vehicle and 
euthanized 2 days later. Tumours were analysed by immunohistochemistry 
using HIF2α and OLIG2 antibodies. Nuclei were counterstained with 
haematoxylin. b, Western blot analysis of DYRK1B in U87 cells stably 
expressing a doxycycline inducible DYRK1B or the empty vector. Cells 
were treated with 0.75 µg ml⁻¹ doxycycline or vehicle for 36 h. Lysates 
of adult mouse cortex (CX) and cerebellum (CB) were used to compare 
exogenous DYRK1B with endogenous levels of the protein. c, Tissue 
sections from experiment in Fig. 5e, f were analysed by immunostaining 
using BrdU antibodies. d, Quantification of BrdU positive cells from 
the experiment in c. Data in the histograms represent means ± s.d. 
(n=5; ***P = 3.065 x 10°’, DYRK1B - Dox versus DYRK1B + Dox). 
Asterisks indicate statistical significance by two-tailed t-test. e, Western 
blot analysis of ectopically expressed V5-DYRK1B, V5-DYRK1B-K140R 
in U87 cells. f, Brain cross-sections of mice intracranially injected with 
U87 cells in e were analysed by immunofluorescence using V5 antibody 
(red, upper panels) to identify exogenous DYRK1B and human vimentin 
antibody (red, lower panels) to identify human glioma cells. Nuclei were 
counterstained with DAPI (blue). T, tumour; B, brain. 
Extended Data Figure 9 | Analysis of DYRK1A, DYRK1B and ID2 
expression in human GBM. a, Scatter plot showing the expression of 
DYRK1A and DYRK1B in GBM. Blue and red dots indicate GBM samples 
with high or low expression of both DYRK1A and DYRK1B, respectively. 
GBM samples were used for Kaplan-Meier survival analysis to evaluate 
the prognostic power of the expression of DYRK1A and DYRK1B shown 
in Fig. 5i. b, Distribution of ΔESrand, representing the null model, for 
ID2 activity (left) and ID2 expression (right). This distribution is used to 
calculate the P value for enrichment of ΔES. Red dot (or vertical black bar) 
represents the ΔES using HIF2α targets. The P value is calculated as the ratio 
of the number of times ΔESrand is greater than ΔES (falls in green regions) 
over the total trials (= 1000). 
Extended Data Figure 10 | Model for the regulation of HIFα stability 
by the DYRK1 kinase and ID2 pathway. In cellular contexts that 
favour HIFα protein instability (normal oxygen levels, but also low ID2 
expression and high DYRK1 expression), prolyl hydroxylase 1 (PHD1) 
is active and positively regulates DYRK1 kinases. Active, tyrosine 
phosphorylated DYRK1 kinases keep ID2 under functional constraint 
by phosphorylation of Thr27. The VCB-Cul2 ubiquitin ligase complex 
efficiently ubiquitylates HIFα (left). With decreasing oxygenation and 
PHD1 inactivation, but also in the presence of downregulation of 
DYRK1, elevated expression of ID2, or ID2(T27A) mutation, the 
un-phosphorylated/un-phosphorylatable pool of ID2 exerts an inhibitory 
function towards the VCB-Cul2 complex by binding directly to VHL 
and Elongin C proteins and displacing Cul2. This results in HIFα 
accumulation (right). The transcriptional activation of the ID2 gene, a 
HIFα target, by HIF2α generates a feed-forward ID2-HIF2α loop that 
amplifies the effects. 
Eight per cent leakage of Lyman continuum photons 
from a compact, star-forming dwarf galaxy 


Y. I. Izotov1, I. Orlitová2, D. Schaerer3,4, T. X. Thuan5, A. Verhamme3, N. G. Guseva1 & G. Worseck6 


One of the key questions in observational cosmology is the 
identification of the sources responsible for ionization of the 
Universe after the cosmic ‘Dark Ages’, when the baryonic matter was 
neutral. The currently identified distant galaxies are insufficient to 
fully reionize the Universe by redshift z ~ 6 (refs 1-3), but low-mass, 
star-forming galaxies are thought to be responsible for the bulk of 
the ionizing radiation (refs 4-6). As direct observations at high redshift are 
difficult for a variety of reasons, one solution is to identify local 
proxies of this galaxy population. Starburst galaxies at low redshifts, 
however, generally are opaque to Lyman continuum photons (ref. 7). 
Small escape fractions of about 1 to 3 per cent, insufficient to ionize 
much surrounding gas, have been detected only in three low-redshift 
galaxies (refs 8-11). Here we report far-ultraviolet observations of the 
nearby low-mass star-forming galaxy J0925+1403. The galaxy is 
leaking ionizing radiation with an escape fraction of about 8 per 
cent. The total number of photons emitted during the starburst 
phase is sufficient to ionize intergalactic medium material that is 
about 40 times as massive as the stellar mass of the galaxy. 

So-called ‘Green Peas’ (GPs), low-mass compact galaxies with very 
active star formation (refs 12, 13), may be promising candidates for sources of 
escaping ionizing radiation. The GP galaxy J0925+1403 was selected 
from the Sloan Digital Sky Survey (SDSS) according to the following 
properties (see Methods for details): (1) a compact structure; (2) the 
presence of emission lines with high equivalent widths in its SDSS 
spectrum, suggesting active ongoing star formation and numerous 
hot O stars producing ionizing Lyman continuum radiation; (3) sufficient 
brightness in the far-ultraviolet (FUV), with a magnitude of 20.7 mag, 
and sufficient redshift (z = 0.301) to allow direct Lyman continuum 
observations with the Cosmic Origins Spectrograph (COS) on board 
the Hubble Space Telescope (HST); and (4) a high ratio of flux from 
the [O III] λ = 5,007 Å line to that from the [O II] λ = 3,727 Å line, or 
O32 = [O III] λ5,007/[O II] λ3,727 = 5 (see Fig. 1), which may indicate 
the presence of density-bounded H II regions (refs 14-16), that is, escaping 
Lyman continuum radiation. 

We first derive some general properties of the galaxy, using the emission-line 
fluxes measured from the SDSS optical spectrum. After correction 
for the Milky Way extinction of A_V,MW = 0.084 mag, we obtain 
an internal extinction A_V,int = 0.36 mag, and a low oxygen abundance 
12 + log(O/H) = 7.91 ± 0.03, or less than 0.2 times the solar value. The 
details of these determinations are given in the Methods section. Here and 
elsewhere in this Letter, errors are ±1σ. 

The same SDSS spectrum is used to fit a spectral energy distribution 
(SED) to derive the galaxy’s global parameters, including the 
stellar mass and the age of the present burst of star formation (see 
Methods). We obtain a starburst age of 2.6 ± 0.2 Myr, a young stellar 
mass of (2.4 ± 0.3) × 10⁸ M⊙, and a total galaxy stellar mass of 
(8.2 ± 0.7) × 10⁸ M⊙ (M⊙, solar mass). The star-formation rate is 
52.2 M⊙ yr⁻¹, as determined from the extinction-corrected Hβ line 
flux. With its low mass, low metallicity, low extinction, compact 
morphology, and high star-formation rate, J0925+1403 shares many 
of the properties of high-redshift Lyman-α (Lyα) emitters. 

GPs with O32 > 5 have been observed before by HST (refs 17, 18), but their 
low redshifts z < 0.3 were not optimal for Lyman continuum observations. 
The HST/COS observations of J0925+1403 were obtained on 
28 March 2015 (program GO13744; PI, T.X.T.). The near-ultraviolet 
acquisition image shows the galaxy to have a very compact structure, 
with a half-light angular diameter of ~0.2″, much smaller than the 
spectroscopic aperture of 2.5″ (Fig. 2). This angular diameter corresponds 
to a linear diameter of ~1 kpc at the angular diameter distance 
of 930 Mpc, derived from the redshift z = 0.301, adopting the Planck 
mission cosmological parameters H_0 = 67.1 km s⁻¹ Mpc⁻¹, Ω_Λ = 0.682 
and Ω_m = 0.318 (ref. 19). 
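
The conversion from angular to linear size quoted above can be checked with a small-angle calculation. The following is a minimal sketch, not the authors' code; the function name `linear_size_kpc` is an illustrative assumption, and the only inputs are the angular diameter distance of 930 Mpc and the ~0.2″ half-light diameter given in the text.

```python
import math

ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)  # one arcsecond in radians

def linear_size_kpc(theta_arcsec: float, d_a_mpc: float) -> float:
    """Small-angle physical size (kpc) subtended by theta at angular diameter distance d_a."""
    return theta_arcsec * ARCSEC_TO_RAD * d_a_mpc * 1.0e3  # Mpc -> kpc

# Values taken from the text: D_A = 930 Mpc, half-light diameter ~0.2 arcsec.
scale = linear_size_kpc(1.0, 930.0)       # kpc per arcsecond
half_light = linear_size_kpc(0.2, 930.0)  # linear half-light diameter in kpc
print(f"{scale:.2f} kpc per arcsec; half-light diameter {half_light:.2f} kpc")
```

The scale of about 4.5 kpc per arcsecond agrees with the scale bar of Fig. 2, and the half-light diameter of about 0.9 kpc with the ~1 kpc quoted.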

Spectra of J0925+1403 were obtained with two gratings. The 
low-resolution G140L grating (<900-2,385 Å) was used to obtain the 
spectrum that includes the redshifted Lyman continuum emission, with 
an exposure time of 5,649 s. The medium-resolution G160M grating 
(1,410-1,796 Å) was used to obtain the spectrum that includes the 
redshifted Lyα 1,216 Å line, with an exposure time of 2,978 s. The 
observations with the G160M and G140L gratings were reduced with 
the standard pipeline and custom software, respectively. The custom 
software gives more accurate results, as it is specifically designed for 
faint HST/COS targets (see Methods section). 
Figure 1 | The O32-R23 diagram for star-forming galaxies. The quantity 
R23 is the total flux of the strongest oxygen lines in the optical spectrum 
relative to Hβ, and is given by ([O III] λ4,959 + [O III] λ5,007 + [O II] λ3,727)/ 
Hβ. This quantity is used for easier comparison with high-redshift 
Lyα emitting galaxies potentially leaking ionizing radiation!®, shown 
by open triangles. O32 is defined in the main text. At low metallicities, 
12 + log(O/H) < 8.3, R23 increases with the metallicity. The location 
of J0925+1403 and known low-redshift Lyman continuum leaking 
galaxies!!! are shown by the filled star and filled squares, respectively. 
SDSS star-forming galaxies”’ are represented by small dots. 
Figure 2 | Near-ultraviolet image of J0925+1403 from the HST. 
The galaxy with a linear diameter of ~2 kpc consists of two compact 
star-forming regions superimposed on an extended low-surface-brightness 
component. The spectroscopic aperture with a diameter of 2.5″ is shown 
by a circle. The scale is 1″ = 4.50 kpc. 


A strong Lyα 1,216 Å emission line is detected in the medium- 
resolution spectrum. Its profile (Fig. 3) shows two peaks, one on each 
side of the line centre (dashed vertical line). According to radiative 
transfer models, the separation between the Lyα line peaks increases 
with increasing optical depth and thus with increasing neutral hydrogen 
column density N(H I) (ref. 20). In the case of J0925+1403, the 
separation of ~300 km s⁻¹ corresponds to a low column density 
(N(H I) < 10'° cm⁻²), allowing the escape of a considerable fraction of 
the Lyα emission. Correcting for the Milky Way and galaxy internal 
reddening, we obtain a Lyα flux density of 8.2 × 10⁻¹⁴ erg s⁻¹ cm⁻². 
Comparing the extinction-corrected Lyα/Hβ flux ratio of 16.7 ± 1.0 
and case B flux ratio of 23.3 (ref. 21), we find that the Lyα escape fraction 
is ~70%, among the highest known so far for GP galaxies'®, and 
consistent with a low H I column density. 
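
The Lyα escape fraction follows directly from the two flux ratios quoted above. A minimal sketch of that arithmetic, assuming the uncertainty comes only from the observed ratio (the case B value is treated as exact, which is an assumption of this example):

```python
def lya_escape_fraction(observed_ratio: float, case_b_ratio: float = 23.3) -> float:
    """Ratio of the observed, extinction-corrected Lya/Hbeta to the case B value."""
    return observed_ratio / case_b_ratio

ratio, ratio_err = 16.7, 1.0          # extinction-corrected Lya/Hbeta and its 1-sigma error
f_lya = lya_escape_fraction(ratio)
f_lya_err = ratio_err / 23.3          # error propagated from the observed ratio only
print(f"f_esc(Lya) = {f_lya:.2f} +/- {f_lya_err:.2f}")
```

This gives roughly 0.72 ± 0.04, matching the ~70% stated in the text.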

The short-wavelength part of the J0925+1403 spectrum, obtained 
with the low-resolution grating G140L, is shown in Fig. 4a by the 
grey solid line. The modelled ultraviolet SED of the young cluster 
with the age and extinction parameters obtained before from SED 
Figure 3 | The double-peaked Lyα emission line in the COS spectrum of 
J0925+1403. The centre of the emission line is shown by a vertical dashed 
line. The small separation of the two emission peaks is indicative of a low 
H I column density according to radiation transfer models for spherical 
geometry””. 
fitting in the optical range (see Methods) is shown by the black solid 
line. We adopt the reddening curve from ref. 22, corresponding to 
R_V,MW = A_V/E(B−V) = 3.1 for the Milky Way and to R_V,int = 2.4 for 
J0925+1403, except for λ < 1,250 Å, where the reddening curve from ref. 23 
is used. The corresponding intrinsic SED is shown in Fig. 4a by the 
black dash-dotted line. The flux density of the intrinsic Lyman continuum 
is determined primarily by the extinction-corrected flux density 
of the Hβ emission line and the starburst age. The starburst age 
is derived from the condition that the observed equivalent width of 
the Hβ emission line is equal to the modelled value, which depends 
on the extinction-corrected flux of the continuum near Hβ and the 
intrinsic Lyman continuum flux. The intrinsic Lyman continuum is 
fairly insensitive to the adopted stellar evolution models, stellar atmosphere 
models and initial mass function (see Methods). It is seen that 
the reddened SED reproduces the observed spectrum very well for 
rest-frame wavelengths >912 Å. 

Figure 4b shows a blow-up of the Lyman continuum spectral region. 
The important feature to note is that the Lyman continuum flux density 
for λ < 912 Å is not zero, but positive with a value equal to (2.35 ± 
0.20) × 10⁻¹⁷ erg s⁻¹ cm⁻² Å⁻¹, when averaged over the 860-913 Å rest- 
frame spectral range. It is detected at the 11.6σ level and is indicated 
by a dotted horizontal line and a filled circle in Fig. 4b. This observed 
Lyman continuum should be corrected for the Milky Way extinction 
before the determination of the Lyman continuum escape fraction. 
The extinction-corrected average Lyman continuum flux density of 
(3.43 ± 0.29) × 10⁻¹⁷ erg s⁻¹ cm⁻² Å⁻¹ is shown by the thick solid 
horizontal line in Fig. 4b. Comparing this value to the intrinsic 
continuum flux density of 4.4 × 10⁻¹⁶ erg s⁻¹ cm⁻² Å⁻¹ beyond the 
Lyman limit for λ < 912 Å, we obtain an absolute Lyman continuum 
escape fraction f_esc = 7.8% ± 1.1% in J0925+1403, where the error 
is determined by the observed Lyman continuum flux density error 
and uncertainties in the modelled intrinsic Lyman continuum flux 
density. This value is several times higher than f_esc of the other three 
low-redshift galaxies with known Lyman continuum leakage. 
J0925+1403 also has the highest [O III]/[O II] flux ratio, the lowest metallicity 
and the lowest stellar mass. Thus we conclude that compact 
low-mass star-forming galaxies with high [O III]/[O II] ratios may 
lose a considerable fraction of their Lyman continuum emission to 
the intergalactic medium (IGM). 
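
The absolute escape fraction is the ratio of the extinction-corrected observed Lyman continuum flux density to the modelled intrinsic one. The sketch below reproduces that ratio, taking the flux densities as 3.43 × 10⁻¹⁷ and 4.4 × 10⁻¹⁶ erg s⁻¹ cm⁻² Å⁻¹. The 10% relative uncertainty on the intrinsic model is an assumption of this example, motivated by the ~10% model-to-model variation reported in Methods; it is not a value the authors state for the error budget.

```python
import math

def escape_fraction(f_obs_corrected: float, f_intrinsic: float) -> float:
    """Absolute escape fraction: escaping over intrinsic Lyman continuum flux density."""
    return f_obs_corrected / f_intrinsic

f_obs, f_obs_err = 3.43e-17, 0.29e-17   # erg/s/cm^2/A, corrected for Milky Way extinction
f_int = 4.4e-16                         # erg/s/cm^2/A, modelled intrinsic value
model_rel_err = 0.10                    # ASSUMPTION: relative uncertainty of the model

f_esc = escape_fraction(f_obs, f_int)
# Combine the relative errors in quadrature.
f_esc_err = f_esc * math.hypot(f_obs_err / f_obs, model_rel_err)
print(f"f_esc = {100 * f_esc:.1f}% +/- {100 * f_esc_err:.1f}%")
```

Under that assumption the result is about 7.8% ± 1.0%, close to the quoted 7.8% ± 1.1%.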

The above determination of f_esc holds for ultraviolet-emitting 
star-forming regions without dust-obscured star formation, which is 
invisible in the ultraviolet and/or optical ranges. The sky region containing 
the galaxy J0925+1403 has been observed in the mid-infrared 
range by the Wide-field Infrared Survey Explorer (WISE). However, 
data for this galaxy are not present in the AllWISE Source Catalog (ref. 24). 
There are also no data in the radio range. Optical, near- and mid- 
infrared observations of other low-metallicity star-forming galaxies 
with similar properties suggest that they are relatively transparent (ref. 25). 
Negligible dust-obscured star formation is also implied by the observed 
thermal free-free centimetre radio emission in dwarf galaxies with 
optical and infrared observations, as it is consistent with the value 
derived from the flux density of the Hβ emission line (ref. 26). Presumably, 
the same conclusion holds for J0925+1403. 

The number of ionizing photons escaping the galaxy is 
Q_H = 3.86 × 10⁵³ s⁻¹ if f_esc = 7.8% (Methods section), corresponding 
to a total number of ionizing photons of 3.6 × 10⁶⁷ emitted during a 
starburst with a 3 Myr duration. This total number of photons is sufficient 
to ionize the low-density IGM gas with a mass of ~4 × 10¹⁰ M⊙, 
or about 40 times higher than the stellar mass of the galaxy, assuming 
one photon suffices to ionize one hydrogen atom. Here we adopt 
the luminosity distance of 1,620 Mpc. Finally, we note that our galaxy 
leaks a large number of ionizing photons, Q_H/L_1,500 ≈ 10²⁵ photons 
s⁻¹/(erg s⁻¹ Hz⁻¹), per unit ultraviolet luminosity, approximately 
three times more than (optimistic) assumptions (ref. 1) used at high redshift 
for f_esc = 0.2, which is primarily due to the young age of the ultraviolet-dominant 
stellar population in J0925+1403. 
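
The photon budget in this paragraph can be reproduced to order of magnitude. The sketch below assumes an escaping photon rate of 3.86 × 10⁵³ s⁻¹ sustained for 3 Myr, one hydrogen atom ionized per photon, and a pure-hydrogen gas; the physical constants and the neglect of helium are assumptions of this example, so the mass comes out somewhat below the ~4 × 10¹⁰ M⊙ quoted.

```python
SECONDS_PER_YEAR = 3.15576e7   # Julian year
HYDROGEN_MASS_G = 1.6726e-24   # proton mass; electron mass neglected
SOLAR_MASS_G = 1.989e33

def ionizable_mass_msun(q_escaping: float, duration_yr: float):
    """Total escaping photons and the hydrogen mass (solar masses) they can ionize once."""
    n_photons = q_escaping * duration_yr * SECONDS_PER_YEAR
    mass_msun = n_photons * HYDROGEN_MASS_G / SOLAR_MASS_G
    return n_photons, mass_msun

n_photons, m_igm = ionizable_mass_msun(3.86e53, 3.0e6)
ratio = m_igm / 8.2e8          # relative to the total stellar mass of the galaxy
print(f"N = {n_photons:.2e} photons, M = {m_igm:.2e} Msun, M/M* = {ratio:.0f}")
```

The photon count of about 3.65 × 10⁶⁷ matches the text, and the mass ratio of roughly 37 is consistent with "about 40 times" the stellar mass.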
Figure 4 | The COS spectrum of J0925+1403. a, On top of the spectrum 
(grey line) is superposed the modelled SED, reddened by both the Milky 
Way and internal extinctions (black solid line). The unreddened SED is 
shown by the dash-dotted line. The short, horizontal dotted line indicates 
the average observed flux density of the Lyman continuum. b, The 
spectrum in the wavelength range 860-913 Å. The average observed value 
with ±3σ error bars is shown by the filled circle and the dotted line. The 
solid line represents the Lyman continuum level after correction for the 
Milky Way extinction. 
METHODS 


The sample of compact star-forming galaxies. A sample of compact star-forming 
galaxies was selected from the spectroscopic database of the SDSS Data Release 
10 (DR10; ref. 28) by applying selection criteria as follows: (1) R50 < 3″, where R50 is 
the galaxy’s Petrosian radius, within which 50% of the galaxy’s flux in the SDSS 
r band is contained; (2) spiral galaxies were excluded; (3) the emission-line ratio 
[O III] λ4,959/Hβ is >1 to include only galaxies with high-excitation H II regions, 
ensuring an accurate determination of their extinctions and chemical compositions; 
(4) galaxies with AGN (active galactic nuclei) activity were excluded using 
line ratios as described below. Applying these criteria, 5,182 galaxies were found in 
the redshift range 0 < z < 1. Of these, 25 galaxies have O32 > 5 and are located in the 
redshift range 0.30-0.35, making direct Lyman continuum observations possible. 
J0925+1403 with equatorial coordinates RA = 09:25:32.37 and dec. = +14:03:13.06 
was selected for the HST/COS ultraviolet observations as one of the brightest 
objects in the sample of 25 galaxies, with a FUV magnitude of 20.7 mag from 
GALEX. The high rest-frame equivalent width EW_Hβ = 177 Å of the Hβ emission 
line in the SDSS spectrum of J0925+1403 suggests active ongoing star formation. 
The [O III] λ5,007/Hβ-[N II] λ6,584/Hα diagnostic diagram (ref. 29) of extragalactic 
objects with emission-line spectra, shown in Extended Data Fig. 1, is useful for discriminating 
between star-forming galaxies (SFG) and active galactic nuclei (AGN) 
as ionizing sources. The solid line represents the model line (ref. 30) that separates the 
two types of objects. Extended Data Figure 1 clearly shows that the gas in 
J0925+1403 is ionized by hot stars in star-forming regions. 
Custom HST/COS G140L data reduction. J0925+1403 was observed with the 
HST/COS grating G140L for two orbits at a central wavelength of 1,280 A in all 
four focal-plane offset positions to minimize fixed-pattern noise and to patch grid- 
wire shadows and other detector blemishes. The data were reduced with CALCOS 
v2.21 and custom software specifically designed for faint HST/COS targets (refs 31, 32), 
improving upon previous results on Lyman continuum leakage obtained with the 
default CALCOS pipeline products (ref. 11). As our spectra were taken at COS Detector 
Lifetime Position 3, we did not employ pulse height filtering to preserve source 
flux in the presence of detector gain sag from previous usage of COS at the nearby 
COS Lifetime Position 1. We used boxcar extraction in a narrow 25 pixel rectan- 
gular window that preserves spectrophotometry for compact sources (Fig. 2) while 
minimizing the background. The COS background is dominated by detector dark 
current, which was subtracted in post-processing using scaled dark exposures to 
account for gain sag in the COS aperture. Specifically, we co-added dark exposures 
taken in similar orbital conditions within 3 months of the science observations, 
smoothed the dark current in the COS aperture with a 500-pixel running average to 
remove Poisson fluctuations, and rescaled the smoothed dark current to the science 
observations using unilluminated regions of the detector. The resulting estimate of 
the COS dark current is sufficiently accurate (~5%) over the wavelength range of 
interest (1,100-1,180 Å) not to affect our analysis. Subexposures were co-added by 
summing raw counts and the smoothed dark current, accounting for differing pixel 
exposure times due to detector blemishes before converting to flux via the COS 
calibration curve. This procedure preserves the Poisson counts of faint sources. 
Airglow contamination (N I λ1,134 Å and O I λ1,304 Å) was eliminated by considering 
only data taken in orbital night in the affected wavelength ranges. We also 
verified with orbital night data that scattered H I Lyα airglow is negligible in the 
Lyman continuum of J0925+1403. The sum of diffuse open-shutter backgrounds 
(earthshine, zodiacal light, Galactic emission; ref. 33) is a factor of ~30 smaller than the 
measured flux, leading us to the conclusion that the measured flux is indeed source 
Lyman continuum flux and not unaccounted background. 
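
The dark-current estimate described above (smooth with a running average, then rescale to the science exposure using unilluminated pixels) can be illustrated in one dimension. This is a simplified sketch, not the custom reduction software: the function names, the centred window with shrinking edges, and the use of a single 1D pixel array with a list of unilluminated indices are all assumptions made for illustration.

```python
def running_mean(values, window):
    """Centred running average; the window shrinks at the array edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def scaled_dark(dark_counts, science_counts, unlit_pixels, window=500):
    """Smooth co-added dark counts and rescale them to the science exposure.

    The scale factor is the ratio of science to smoothed dark counts summed
    over pixels that receive no source light (hypothetical index list).
    """
    smooth = running_mean(dark_counts, window)
    scale = (sum(science_counts[i] for i in unlit_pixels)
             / sum(smooth[i] for i in unlit_pixels))
    return [scale * d for d in smooth]

# Toy example: a flat dark of 2 counts per pixel, and a science frame whose
# unilluminated pixels (0-19) show half that dark level.
dark = [2.0] * 100
science = [1.0] * 100
estimate = scaled_dark(dark, science, unlit_pixels=range(20), window=11)
print(estimate[50])
```

In the toy case the rescaled dark is 1 count per pixel everywhere, as expected when the science exposure sees half the dark rate of the co-added darks.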
Extinction in the optical range. Using the observed decrement of several hydrogen 
Balmer emission lines, we corrected the line fluxes relative to the Hβ flux 
for two effects: (1) reddening adopting the extinction curve from ref. 22 and 
(2) underlying hydrogen stellar absorption. Both are obtained simultaneously by 
an iterative procedure (ref. 34). The quantity derived from the hydrogen Balmer decrement 
is the extinction coefficient C_Hβ, corresponding to the extinction at the Hβ 
wavelength A_Hβ = 2.512 × C_Hβ. Using the A(λ)/A_V fits’, we approximate the ratio 
C_Hβ/A_V by the relation 

$$ C_{\mathrm{H}\beta} / A_V = 0.6633 - 0.1317 R_V + 0.0294 R_V^2 - 0.0024 R_V^3 \qquad (1) $$

with an accuracy better than 0.1% in the range R_V = A_V/E(B−V) = 2.0-4.0. We note 
that C_Hβ/A_V is weakly dependent on R_V, decreasing by <10% with R_V increasing 
from 2 to 4. 

The correction for reddening was done in two steps. First, the observed spectrum, 
uncorrected for redshift, was corrected for the Milky Way extinction with 
A_V,MW = 0.084 mag and R_V,MW = 3.1 (NASA Extragalactic Database), corresponding 
to C_Hβ,MW = 0.039. Then, the rest-frame spectrum was corrected for the internal 
extinction of J0925+1403, obtained as C_Hβ,int = 0.175 from the hydrogen Balmer 
decrement, corresponding to A_V,int = 0.38 mag for R_V,int = 3.1 and A_V,int = 0.36 mag 
for R_V,int = 2.4. The extinction-corrected emission-line fluxes relative to the Hβ 
emission line flux, I(λ)/I(Hβ), and the rest-frame equivalent widths, EW(λ), are 
shown in Extended Data Table 1. 
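
Equation (1) links the extinction coefficient at Hβ to the visual extinction for a given R_V, so the numbers in the two-step correction can be cross-checked. A minimal sketch, in which the function name is an illustrative assumption and the coefficients are those of equation (1):

```python
def c_hbeta_over_av(r_v: float) -> float:
    """Cubic approximation of C(Hbeta)/A_V from equation (1), valid for R_V = 2.0-4.0."""
    return 0.6633 - 0.1317 * r_v + 0.0294 * r_v**2 - 0.0024 * r_v**3

# Milky Way: A_V = 0.084 mag with R_V = 3.1 gives C(Hbeta).
c_mw = 0.084 * c_hbeta_over_av(3.1)

# Internal extinction: C(Hbeta) = 0.175 converted to A_V for two values of R_V.
a_v_int_31 = 0.175 / c_hbeta_over_av(3.1)
a_v_int_24 = 0.175 / c_hbeta_over_av(2.4)

print(f"C_MW = {c_mw:.3f}, A_V,int = {a_v_int_31:.2f} (R_V=3.1), {a_v_int_24:.2f} (R_V=2.4)")
```

The results, 0.039, 0.38 mag and 0.36 mag, agree with the values given in the text.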

Element abundances. The [O III] λ4,363 Å emission line is detected in the SDSS 
spectrum of J0925+1403. This allows for a reliable oxygen abundance determination 
using the direct T_e-method. The temperature T_e(O III) is calculated based on 
the [O III] λ4,363/λ(4,959 + 5,007) line ratio (ref. 35). We adopt a two-zone photoionized 
H II region model: a high-ionization zone with temperature T_e(O III), where 
[O III] lines originate, and a low-ionization zone with temperature T_e(O II), where 
[O II] lines originate. For T_e(O II), we use a model relation between the electron 
temperatures T_e(O III) and T_e(O II) (ref. 35). Ionic and total abundances of oxygen, 
nitrogen and neon are derived using expressions for ionic abundances and ionization 
correction factors (ICF) (ref. 35). The derived temperatures and element abundances 
are shown in Extended Data Table 2. We note that the weak [S II] λ6,717, 6,731 
emission lines, which are used for the electron number density determination, 
cannot be measured because they are located in the noisy part of the J0925+1403 
spectrum. Therefore, we adopted the value N_e = 100 cm⁻³, typical of extragalactic 
H II regions. 

Fitting of SED and determination of galaxy global parameters. The luminosity, 
stellar mass and star-formation rate are important global galaxy characteristics. 
For the determination of the stellar mass, we follow an earlier approach for modelling 
dwarf galaxies'?°*”3°. The method is based on fitting a series of model SEDs to 
the observed one and finding the best fit. The fit was performed for the SDSS 
spectrum over the entire observed spectral range of 3,900-9,200 Å. As the SED 
is the sum of both stellar and ionized gas emission, its shape depends on the relative 
contribution of these two components. In J0925+1403, with a rest-frame Hβ 
equivalent width EW_Hβ = 177 Å, the ionized gas continuum is strong and should 
be subtracted before determining the stellar mass. 

We carried out a series of Monte Carlo simulations to reproduce the SED of 
J0925+1403. To derive the stellar SED, we use a grid of instantaneous burst SEDs 
in a wide range of ages from 0.0 Myr to 15 Gyr, calculated with Starburst99°”*. 
We adopted Geneva stellar evolution tracks for non-rotating and rotating stars*”"° 
and Padova stellar evolution tracks"!. Various models of stellar atmospheres were 
used’, and we adopt two different stellar initial mass functions***”. Then the 
SED with any star-formation history can be obtained by integrating the instanta- 
neous burst SEDs over time with a specified time-varying star-formation rate. The 
SED of the gaseous continuum was taken into account. It included hydrogen and 
helium free-bound, free-free and two-photon emission“. 

The star-formation history is approximated assuming a recent short burst with 
age t_y < 10 Myr, which accounts for the young stellar population, and a prior continuous 
star formation for the older stars during the time interval between initial 
time t_i and final time t_f (t_f < t_i and zero age is now). The contribution of each stellar 
population to the SED was parameterized by the ratio b = M_y/M_o, where M_y and 
M_o are respectively the masses of the young and old stellar populations. Then the 
total stellar mass is M_* = M_y + M_o. 

We calculated 10° Monte Carlo models by randomly varying t_y, t_i, t_f and b, while 
other parameters, such as evolutionary tracks, stellar atmosphere models, initial 
mass function and metallicity are kept fixed. In all cases we used models with 
metallicities that best match the metallicity of J0925+1403. We also calculate the 
equivalent width EW_Hβ for each model. The best solution is required to fulfil two 
conditions. First, only models in which the modelled equivalent width EW_Hβ of the 
Hβ emission line agrees with the observed value within 5% were selected. Second, 
the best modelled SED among selected models for each set of fixed parameters 
was found from χ² minimization of the deviation between the modelled and the 
observed continuum in five wavelength ranges, which are free of the emission lines 
and residuals of the night-sky lines. We found best solutions for 33 combinations 
of evolutionary tracks, stellar atmosphere models and initial mass functions to 
investigate the dependence of the Lyman continuum flux density on the input 
parameters. All these solutions provide almost equally good fits at rest-frame wavelengths 
greater than 912 Å and small variations of Lyman continuum. The optical 
spectrum of J0925+1403 with the overlaid SED of one out of the 33 best models is 
shown in Extended Data Fig. 2. Here we adopted a Salpeter initial mass function*®, 
Padova evolutionary tracks‘! and a combination of stellar atmosphere models’. 

The modelled SED in the ultraviolet range is shown in Fig. 4a by a dash-dotted 
line. We note that the amount of the flux decrease at the Lyman break at a fixed 
starburst age depends on the adopted stellar evolution tracks, stellar atmosphere 
models, and initial mass function. In particular, the Lyman break in models with 
non-rotating stars is stronger than in models with rotating stars. However, we find 
from the SED fitting that the intrinsic flux density I(λ912) of the Lyman continuum 
varies only by ~10% in the various sets of models. This is primarily due to the fact 
that to fit the observed EW_Hβ the starburst age in models with rotating stars should 
be greater than that in models with non-rotating stars**. For young bursts with an 
age of 0-3 Myr we obtained the approximate relation I(λ912) ≈ I(Hβ)/12, which 
may be used to estimate the intrinsic Lyman continuum flux density in galaxies 
with f_esc < 1 without modelling. We also note that this relation is not much 
changed if continuous star formation is adopted for the young stellar population 
instead of an instantaneous burst. This is due to the dominant contribution of the 
0-3 Myr stellar population to both the Lyman continuum and Hβ fluxes, while the 
contribution of older stars is small. 

To derive global characteristics, the observed fluxes were transformed to luminosities 
adopting the luminosity distance 1,620 Mpc (ref. 49) and the cosmological 
parameters H_0 = 67.1 km s⁻¹ Mpc⁻¹, Ω_Λ = 0.682, Ω_m = 0.318 (ref. 19). Additionally, 
for the determination of the Hβ luminosity L_Hβ and star-formation rate, the Hβ flux 
density was corrected for aperture effects using the relation 2.512^(r(app) − r), where r and 
r(app) are, respectively, the SDSS r-band total magnitude and the magnitude within 
the round spectroscopic 3″ diameter aperture. The aperture correction for 
J0925+1403 is 1.54. The derived Hβ luminosity (Extended Data Table 3) corresponds to a 
number of ionizing photons Q_H = 4.94 × 10⁵⁴ s⁻¹, in good agreement with the Q_H 
obtained from the integration of the number of ionizing photons in the modelled 
intrinsic spectrum (it is only ~8% lower). The corresponding star-formation rate 
is derived using the standard Kennicutt relation. 
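
The aperture correction is a flux ratio expressed through a magnitude difference. The sketch below evaluates it for hypothetical magnitudes chosen only to reproduce a factor near 1.54 (the actual SDSS magnitudes are not given here), and then checks that 7.8% of an intrinsic photon rate of 4.94 × 10⁵⁴ s⁻¹ gives the escaping rate quoted in the main text.

```python
def aperture_correction(r_total: float, r_aperture: float) -> float:
    """Flux ratio total/aperture from magnitudes: 2.512 ** (r(app) - r)."""
    return 2.512 ** (r_aperture - r_total)

# HYPOTHETICAL magnitudes, differing by 0.47 mag, for illustration only.
factor = aperture_correction(r_total=20.00, r_aperture=20.47)

# Escaping photon rate for f_esc = 7.8% of the intrinsic Q_H = 4.94e54 s^-1.
q_escaping = 0.078 * 4.94e54
print(f"aperture factor = {factor:.2f}, escaping Q_H = {q_escaping:.2e} s^-1")
```

A magnitude difference of about 0.47 corresponds to the factor of 1.54, and the escaping rate of about 3.85 × 10⁵³ s⁻¹ matches the main-text value to within rounding.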
Reddening law for J0925+1403. The comparison of the modelled SEDs with the 
observed photometric and spectroscopic data in the entire ultraviolet and optical 
range allows us to verify which extinction curve is most applicable to J0925+1403. It 
is known that the extinction curve in the ultraviolet range for the Small Magellanic 
Cloud with oxygen abundance 12 + log(O/H) ~ 8.1 is much steeper than the Milky 
Way extinction curve, and is characterized by R_V ≈ 2.7 (refs 51-53). On the other 
hand, the extinction curve in the optical range is insensitive to variations of R_V. 
The oxygen abundance of J0925+1403 is lower, ~7.9. Therefore we may expect the 
extinction curve in this galaxy to be characterized by an even lower R_V. 

The observed ultraviolet and optical spectra are shown in Extended Data Fig. 3 
by grey lines. Their fluxes are consistent with the SDSS and GALEX photometric 
fluxes shown by the black symbols. To fit these observational data we reddened the 
intrinsic modelled SED, keeping the same Milky Way extinction C_Hβ,MW = 0.039, 
R_V,MW = 3.1, and the internal extinction C_Hβ,int = 0.175, but varying R_V,int = 3.1 
(dotted line), 2.7 (dashed line) and 2.4 (solid line). The modelled SEDs were calculated 
with Padova stellar evolution models", a combination of stellar atmosphere 
models”? and a Salpeter initial mass function”. It is seen that the SEDs reddened 
with R_V,int = 3.1 and 2.7 do not fit the observed fluxes in the ultraviolet range. A 
higher extinction coefficient C_Hβ,int would be needed in the ultraviolet. This is 
difficult to understand in the framework of a model with a uniform dust distribution, 
characterized by a single value of the extinction. On the other hand, a SED 
reddened by the same extinction C_Hβ,int = 0.175, but with R_V,int = 2.4, nicely reproduces 
the observed fluxes. If we use different stellar atmosphere models for the SED 
modelling, we find somewhat larger R_V,int in some cases, but not exceeding 2.7. 
Relative Lyman continuum escape fraction. To compare with other studies, 
we also estimate the Lyman continuum escape fraction from the often-used 
equation

$$f_{\rm esc} = f_{\rm esc,rel} \times 10^{-0.4 A_{1{,}500}} = \frac{(f_{1{,}500}/f_{900})_{\rm int}}{(f_{1{,}500}/f_{900})_{\rm obs}} \times 10^{-0.4 A_{1{,}500}} \qquad (2)$$

where f_900 and f_1,500 are the flux densities at rest-frame wavelengths 900 Å and
1,500 Å, respectively, the subscripts ‘int’ and ‘obs’ denote mean intrinsic and
observed flux densities, (f_1,500/f_900)_int = 1.36 from our SED fits, (f_1,500/f_900)_obs = 5.79
and A_1,500 = 1.58 mag is the internal extinction at 1,500 Å (Fig. 4a). From this we
obtain the relative escape fraction f_esc,rel = 23.6%, and the absolute f_esc = 5.5%. The
latter value is lower than our above determination for two reasons. First, equation
(2) does not take into account the fact that the expected extinction at 900 Å is
higher than at 1,500 Å. Second, the observed flux density f_900 must be corrected
for foreground extinction from the Milky Way. Therefore, f_esc obtained with
equation (2) is underestimated.
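
The arithmetic behind equation (2) can be checked with a short sketch. It uses only the three numbers quoted above; the function and argument names are illustrative and are not taken from the authors' code.

```python
def escape_fractions(ratio_int, ratio_obs, a_1500_mag):
    """Relative and absolute Lyman continuum escape fractions from equation (2).

    ratio_int and ratio_obs are the intrinsic and observed (f_1500 / f_900)
    flux-density ratios; a_1500_mag is the internal extinction at 1,500 A in mag.
    """
    f_esc_rel = ratio_int / ratio_obs
    f_esc = f_esc_rel * 10.0 ** (-0.4 * a_1500_mag)
    return f_esc_rel, f_esc


# Values quoted in the text for J0925+1403.
f_rel, f_abs = escape_fractions(ratio_int=1.36, ratio_obs=5.79, a_1500_mag=1.58)
print(f"relative escape fraction: {f_rel:.3f}")
print(f"absolute escape fraction: {f_abs:.3f}")
```

With the rounded inputs given in the text, the result agrees with the quoted 23.6% and 5.5% to within rounding of the inputs.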

Code availability. We have opted not to make the codes for the custom HST/COS
data reduction and the galaxy SED fitting available because they
are not yet adapted for public use. The Starburst99 code used to generate spectral
energy distribution of single stellar populations is available at
http://www.stsci.edu/science/starburst99/docs/parameters.html.
Extended Data Figure 1 | The diagnostic diagram for narrow emission
lines. The foundations of this diagram are given in ref. 29. The galaxy
J0925+1403 is shown by a large filled star, and the Luminous Compact Galaxies by
small dark-grey circles. Also plotted are the 100,000 emission-line galaxies
from SDSS DR7 (cloud of light-grey dots). The solid line separates star-forming
galaxies (SFG) from active galactic nuclei (AGN).
Extended Data Figure 2 | SED fitting of the optical spectrum of
J0925+1403. The rest-frame extinction-corrected spectrum is shown by a
grey line. The stellar, ionized gas, and total modelled SEDs are shown by black
dotted, dashed and solid lines, respectively.
Extended Data Figure 3 | A comparison of the observed ultraviolet and
optical spectrum with the modelled SED. The observed spectrum is shown
by a grey line. The total GALEX and SDSS photometric fluxes are represented
by filled squares and filled circles, respectively, while the SDSS photometric
fluxes within a round spectroscopic aperture of 3″ diameter are shown by
open circles. Modelled SEDs, which are reddened by the Milky Way with
R_V,MW = 3.1 and internal extinction with different values of R_V,int, are shown by
black lines. Dotted, dashed and solid lines correspond to R_V,int = 3.1, 2.7, and
2.4, respectively.
Extended Data Table 1 | Emission-line fluxes and equivalent widths in the optical spectrum

Line            Wavelength (Å)   100 × I(λ)/I(Hβ)*   EW(λ)† (Å)
[O II]          3727             125.5 ± 4.6         77
H9              3835             9.1 ± 2.0           7
[Ne III]        3868             47.5 ± 1.9          33
He I + H8       3889             17.1 ± 2.3          12
[Ne III] + H7   3968             32.0 ± 2.6          22
Hδ              4101             28.7 ± 2.3          22
Hγ              4340             43.7 ± 2.5          36
[O III]         4363             11.7 ± 0.6          10
He I            4471             6.1 ± 0.5           6
Hβ              4861             100.0 ± 3.6         177
[O III]         4959             199.9 ± 6.7         306
[O III]         5007             608.1 ± 20.         1174
He I            5876             11.3 ± 0.7          11
[O I]           6300             4.6 ± 0.5           …
Hα              6563             280.2 ± 9.9         732
[N II]          6584             13.2 ± 0.8          26

*Extinction-corrected flux relative to the extinction-corrected flux I(Hβ) = 4.92 × 10^−15 erg s^−1 cm^−2 of the Hβ emission line, multiplied by 100.
†Rest-frame equivalent width.
Extended Data Table 2 | Physical conditions and chemical composition

Parameter           Value
T_e(O III), K       15,010 ± 410
T_e(O II), K        14,010 ± 360
N_e(O II), cm^−3    100*
O+/H+ × 10^5        1.42 ± 0.11
O2+/H+ × 10^5       6.65 ± 0.50
O/H × 10^5          8.06 ± 0.52
12 + log O/H        7.91 ± 0.03
N+/H+ × 10^6        1.12 ± 0.07
ICF(N)†             5.42
N/H × 10^6          6.05 ± 0.45
log N/O             −1.12 ± 0.04
Ne2+/H+ × 10^5      1.27 ± 0.11
ICF(Ne)†            1.08
Ne/H × 10^5         1.37 ± 0.13
log Ne/O            −0.77 ± 0.05

*Assumed value.
†Ionization correction factor.
Extended Data Table 3 | Global characteristics of J0925+1403

Parameter               Value
I(Hβ)*                  49.2 ± 1.3
Redshift                0.301323
Luminosity distance†    1620
L(Hβ)‡                  (2.32 ± 0.04) × 10^42
SFR§                    52.2
Q_H||                   4.94 × 10^54
Q_H(esc)||              3.86 × 10^53
t(burst)¶               2.6 ± 0.2
M_y/M_⊙                 (2.4 ± 0.3) × 10^8
M_★/M_⊙                 (8.2 ± 0.7) × 10^8

*Extinction-corrected flux density in units of 10^−16 erg s^−1 cm^−2.
†In units of Mpc.
‡Extinction- and aperture-corrected luminosity in units of erg s^−1.
§Star-formation rate in M_⊙ yr^−1 derived from the Hβ luminosity.
||Q_H and Q_H(esc) are the numbers of Lyman continuum photons (in units of s^−1) emitted by massive stars and escaped from the H II region, respectively.
¶Burst age in Myr.
Weakened magnetic braking as the origin of
anomalously rapid rotation in old field stars

Jennifer L. van Saders, Tugdual Ceillier, Travis S. Metcalfe, Victor Silva Aguirre, Marc H. Pinsonneault,
Rafael A. García, Savita Mathur & Guy R. Davies


A knowledge of stellar ages is crucial for our understanding of 
many astrophysical phenomena, and yet ages can be difficult to 
determine. As they become older, stars lose mass and angular 
momentum, resulting in an observed slowdown in surface 
rotation. The technique of ‘gyrochronology’ uses the rotation
period of a star to calculate its age. However, stars of known
age must be used for calibration, and, until recently, the approach
was untested for old stars (older than 1 gigayear, Gyr). Rotation
periods are now known for stars in an open cluster of intermediate
age (NGC 6819; 2.5 Gyr old), and for old field stars whose ages
have been determined with asteroseismology. The data for
the cluster agree with previous period-age relations, but these
relations fail to describe the asteroseismic sample. Here we report
stellar evolutionary modelling, and confirm the presence of
unexpectedly rapid rotation in stars that are more evolved than the 
Sun. We demonstrate that models that incorporate dramatically 
weakened magnetic braking for old stars can—unlike existing 
models—reproduce both the asteroseismic and the cluster data. 
Our findings might suggest a fundamental change in the nature 
of ageing stellar dynamos, with the Sun being close to the critical 
transition to much weaker magnetized winds. This weakened 
braking limits the diagnostic power of gyrochronology for those 
stars that are more than halfway through their main-sequence 
lifetimes. 

There are two approaches to the calibration and testing of gyrochro- 
nology. The first is a purely empirical approach, which uses a sample 
of stars with independently measured ages and rotation periods to 
construct period-age relationships. These relationships are gener- 
ally simple power laws that take into account age, period, and some 
mass-dependent quantity; they have seen wide usage. The second,
model-based approach uses stellar models and a prescription
for magnetic braking to account for the functional dependence of the 
rotation period on all relevant stellar quantities, but relies on calibra- 
tors to determine the magnitude of the angular momentum loss. For 
this reason, the model-based approach is well suited to calibrating 
samples that cover parameter space only sparsely; it also provides a 
method for attaching physical meaning to observed braking behaviour. 

Magnetic-braking prescriptions are typically scaled from the solar 
case; for example, the Skumanich relation yields angular momentum
loss of the form dJ/dt ∝ ω^3, where t is time, J is angular momentum,
and ω is the angular rotation velocity. These relations often use the
dimensionless Rossby number (defined as the ratio of the rotation
period to the convective overturn timescale, Ro = P/τ_cz) to characterize
departures from this simple power law. Rossby-number thresholds
and scalings are routinely invoked to parameterize the magnetic-field
strength; the dependence of the spin-down on stellar mass and
composition; the observed saturation of magnetic braking in rapid
rotators; and the sharp transition from slow to rapid rotation that
occurs in hot stars (hotter than 6,250 K) because of their thinning
convective envelopes. Under traditional prescriptions, stars undergo
braking throughout their main-sequence lifetimes, regardless of rotation
rate. Observations of stellar clusters of young and intermediate
ages have indicated that such treatments are reasonable. However,
there is a dearth of old stars with which to test such relationships, 
owing to the long-period, low-amplitude signatures of rotation in such 
stars, and to the challenge of age measurements in field stars. Data 
from the Kepler telescope provide a first test of these prescriptions in 
stars that are older than the Sun. 
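
For orientation, the link between a Skumanich-type loss law and the familiar square-root spin-down can be written out in a few lines. This is a sketch under simplifying assumptions that are not part of the models used here: a constant moment of inertia I, a constant braking coefficient k, and an initial angular velocity ω_0.

```latex
\begin{align}
\frac{dJ}{dt} &= -k\,\omega^{3}, \qquad J = I\omega \quad (I \text{ constant}) \\
\frac{d\omega}{dt} &= -\frac{k}{I}\,\omega^{3}
\;\;\Longrightarrow\;\;
\frac{1}{\omega^{2}(t)} = \frac{1}{\omega_{0}^{2}} + \frac{2k}{I}\,t \\
\omega(t) &\propto t^{-1/2}, \qquad P(t) = \frac{2\pi}{\omega(t)} \propto t^{1/2}
\quad \text{for } t \gg \frac{I}{2k\,\omega_{0}^{2}}
\end{align}
```

At late times the memory of the initial rotation rate is lost, which is why a single period-age relation can be useful at all under this kind of braking.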

The high-precision, long-baseline light curves from Kepler make 
such investigations possible. The rotation of a star manifests itself in 
Kepler data as a periodic modulation in the intensity, as dark starspots 
rotate into and out of view. Intensity variations due to stellar oscilla- 
tions are likewise present in the light curve, on shorter timescales. 
Low-degree modes of oscillation probe the conditions of the deep 
stellar interior and internal structure of the star, providing ages that are 
precise to better than 10% in stars for which many oscillation modes 
are detected at high signal-to-noise ratios.

The first efforts to calibrate the gyrochronology relations using 
Kepler seismic targets uncovered tension between the cluster and
seismic samples. Although the form of the mass-period-age relation
used in that study was similar to those in previous studies, the range
of ages and more sophisticated treatment of observational uncertain- 
ties made it possible to determine that the sample did not obey a sin- 
gle power-law period-age relation. However, even this approach has 
limitations: it does not account for metallicity or for changes in the 
stellar moment of inertia, and it relied on a sample for which detailed 
seismic modelling and spectroscopic data were lacking for some stars, 
biasing the seismic ages. 

To address the limitations of previous work and to take full advan- 
tage of precisely determined stellar parameters, we utilize a subset 
of 21 Kepler stars (selected to have detailed asteroseismic modelling
and high-precision ages, measured rotation periods, and measured
metallicities) and couple these observations to stellar evolutionary
models. Sample selection, details of the modelling to derive
asteroseismic ages, and extraction of rotation periods are described
in the Methods. Figure 1 shows the surface (in terms of period, age
and effective temperature, T_eff, a proxy for mass) upon which the
stars are expected to lie. Actual cluster and seismic data are
overplotted; while the clusters and young asteroseismic targets lie close to
the plane, the intermediate-age and old asteroseismic stars are strik- 
ingly discrepant and nearly all lie below the surface, owing to the fact 
that they are rotating more rapidly than expected. When we account 
Figure 1 | The period-age plane as predicted by gyrochronology,
compared with observed periods. The empirical gyrochronology
relation is shown as a plane. Data from open stellar clusters are shown
as small squares (NGC 6811 cluster; 1 Gyr) and triangles (NGC 6819
cluster; 2.5 Gyr). Large circles represent the seismic sample of 21 stars that
are detected in the Kepler data; this sample falls systematically below the
plane. The solar symbol (⊙) marks the Sun, which falls on the plane by
design. The effective temperature, T_eff, is a proxy for mass.


for uncertainties in the ages, masses, and compositions (see Methods) 
and predict the rotation periods that we should have observed given 
existing period-age relations (P_expected), we find that the systematic
offset persists; stars of roughly solar age and older are rotating more
rapidly than predicted, regardless of the chosen period-age relation.
Figure 2 highlights the systematic offset by plotting the ratios of the
expected to observed periods for each star in the sample, where the
expected periods are calculated using stellar models with a braking
law calibrated on the Sun and on open clusters (a similar plot is
provided in Extended Data Fig. 3 for the empirical relation). The
theoretical models fit the data with a χ² value of 54.9, whereas the
empirical relation yields a χ² of 155.6. In both cases, the systematic
offset towards short rotation periods is an indication that the models 
predict more angular momentum loss than actually occurs. 

We therefore conclude that magnetic braking is weaker in these 
intermediate-age and old stars. We extend our model by postulating 
that, in addition to the Rossby scaling already present in the theoretical
models, effective loss of angular momentum ceases above a critical
Rossby threshold. We modify the prescription for angular momentum
loss to conserve angular momentum above a specified Ro_crit.
Graphs showing the effects of varying Ro_crit values on the models are
provided in Fig. 3. The inclusion of the threshold has the desired effect:
it reproduces the existing gyrochronology relations and cluster data
at young ages, when Ro is smaller because of more rapid rotation,
but allows stars to maintain unusually short rotation periods at late
times. Furthermore, it reproduces the trend in mass that is apparent
in Figs 2 and 3 (and the trend in the zero-age main-sequence (ZAMS)
T_eff, which selects stars with similar rotational histories; we perform
all fits using the seismic mass, but use ZAMS T_eff for display to simplify
the figures). Hotter, more massive stars reach the critical Rossby
threshold at younger ages, and we therefore see discrepancies between 
the fiducial gyrochronology relationships and the observations at 
Figure 2 | Ratios of the predicted rotation period to the observed
period. Predicted rotation periods are derived from existing period-age
relations; observed periods are as detected by the Kepler telescope.
These ratios are plotted against stellar age. Stars are divided according to
decreasing ZAMS T_eff: a, 5,900-6,200 K; b, 5,600-5,900 K; c, 5,100-5,400 K.
Period ratios for open clusters are shown as black symbols, as follows:
diamonds, M37; circles, Praesepe; squares, NGC 6811; triangles, NGC
6819. The Sun (⊙) is also marked. Coloured circles represent seismic
targets; coloured triangles represent known planet hosts; coloured squares
represent the binary stars 16 Cygni A and B. All errors are shown to 1σ.
Stars are coloured according to ZAMS T_eff, with blue representing the
hottest stars and red the coolest stars. Shaded regions represent the period
ratios permitted in each T_eff bin for a model in which Ro_crit = 2.16.


earlier times as ZAMS T_eff increases. The best-fit value for the Rossby
threshold, given our sample, is Ro_crit = 2.16 ± 0.09 (χ² = 13.3) for the
modified models. The shaded grey regions in Figs 2 and 3 denote the
full range of period ratios (P_Rocrit/P_fiducial), and the period-age combinations
allowed for a model with Ro_crit = 2.16, given the ranges of
ZAMS T_eff that are represented in each panel. These regions encompass
all combinations of mass (0.4-2.0 solar masses) and metallicity
(−0.4 < [Z/H] < +0.4) that together produce a star within the appropriate
ZAMS T_eff range for each panel of Figs 2 and 3, on both the
main sequence and the subgiant branch.
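
The qualitative effect of the threshold can be illustrated with a toy integration. This is not the braking law used in this work: it assumes a constant moment of inertia and a pure Skumanich-like loss law, switches braking off once Ro exceeds Ro_crit, and takes a solar age of 4.57 Gyr, which is an outside assumption. The solar period and overturn timescale are the values quoted in the Methods; all function and variable names are illustrative.

```python
import math

TAU_CZ_DAYS = 1.015e6 / 86400.0   # solar convective overturn timescale (Methods)
P_SUN_DAYS = 25.4                 # solar rotation period (Methods)
T_SUN_GYR = 4.57                  # assumed solar age, not given in the text


def toy_spin_down(ro_crit, t_end_gyr, p0_days=1.0, dt_gyr=0.001):
    """Euler-integrate d(omega)/dt = -k * omega**3, with braking off once Ro > ro_crit.

    Returns the rotation period in days after t_end_gyr.
    """
    # k is fixed so that the unthresholded toy star reaches P_SUN_DAYS at T_SUN_GYR.
    k = (P_SUN_DAYS ** 2 - p0_days ** 2) / (4.0 * math.pi ** 2 * 2.0 * T_SUN_GYR)
    omega = 2.0 * math.pi / p0_days          # rad per day
    for _ in range(round(t_end_gyr / dt_gyr)):
        rossby = (2.0 * math.pi / omega) / TAU_CZ_DAYS
        if rossby <= ro_crit:                # braking only below the threshold
            omega -= k * omega ** 3 * dt_gyr
    return 2.0 * math.pi / omega


p_standard = toy_spin_down(ro_crit=math.inf, t_end_gyr=10.0)
p_weakened = toy_spin_down(ro_crit=2.16, t_end_gyr=10.0)
print(f"period at 10 Gyr without threshold: {p_standard:.1f} d")
print(f"period at 10 Gyr with Ro_crit=2.16: {p_weakened:.1f} d")
```

In this toy the thresholded star stalls near the solar period, whereas the standard star keeps slowing. In the full models the period above the threshold still changes as the moment of inertia evolves.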

We emphasize that our result—that old stars are rotating anom- 
alously rapidly—persists regardless of the choice of period-age 
Figure 3 | The effects of a Ro_crit threshold on rotational evolution. Panels
are divided according to decreasing ZAMS T_eff: a, 5,900-6,200 K; b, 5,600-5,900 K;
c, 5,100-5,400 K (as in Fig. 2). Black symbols represent open
stellar clusters, as follows: diamonds, M37; circles, Praesepe; squares, NGC
6811; triangles, NGC 6819. The Sun (⊙) is also marked. Model curves
are shown for solar metallicity and ZAMS T_eff of 6,050 K (a), 5,750 K (b),
and 5,250 K (c). Curves are colour-coded by Ro_crit: black, no Ro_crit cut;
dark blue, Ro_crit = 1.0; light blue, Ro_crit = 1.5; green, Ro_crit = 2.0; orange,
Ro_crit = 2.5; red, Ro_crit = 3.0; dashed black, Ro_crit = 2.16. Successive curves
are offset by +0.1 Gyr to improve readability. Seismic (cluster) targets
are overplotted in solid (open) symbols with 1σ errors. Shaded regions
represent Ro_crit = 2.16 models for each T_eff range.


relationship, asteroseismic modelling pipeline, or model uncertainties
from the literature (see Methods). The period-detection algorithms
and seismic ages have been well tested. The tight rotational
sequences observed in intermediate-age open clusters suggest that
we are not simply detecting the rapidly rotating tail of a population 
with a wide distribution of rotation rates, and it is unlikely that our 21 
stars with detected rotation rates are atypical (see Methods for further 
discussion). 

Our model represents the limiting case in which the braking is so 
ineffective that the star ceases to shed angular momentum. If we instead 
allow the exponent, a, of the period-age relation P ∝ t^(1/a) to vary, while
fixing Ro_crit to the solar Ro value of 2.16, we do not obtain a comparable
fit in the old stars until a is greater than ~20, suggesting that the braking
is indeed drastically reduced. However, we do observe spot modulation
in these stars, which implies at least small-scale magnetic activity. The
starspot properties may or may not directly reflect changes in the large-scale
magnetic field that governs spin-down. A change in field geometry
from a simple dipole to higher-order fields could produce weakened
braking, as could a change in the distribution of spots on the stellar
surface. It could also be the case that the large-scale field strength
undergoes a transition at high Rossby numbers. Abrupt changes in
the efficiency of angular momentum loss have been proposed in order
to explain the rotational distributions in young clusters, and there is
evidence for a Rossby-number-governed shift in field morphologies in
low-mass M dwarfs. Observations of detailed magnetic-field
morphologies and corresponding simulations are lacking for stars at higher
Rossby numbers than the Sun, and both are critical to understanding 
the source of the observed anomalous rotation. 

Regardless of the mechanism that governs the spin-down, the 
observation that existing rotation-age relationships do not predict the
observed rotation rates has immediate implications for gyrochronol- 
ogy. The rotation periods of the middle-aged stars that have passed this 
Rossby threshold represent only lower limits on the age. The empir- 
ical calibrations must be modified, and the weakened relationship 
between period and age will result in substantially more uncertain 
rotation-based ages for stars in the latter halves of their lives. The presence
of such a Rossby threshold defines boundaries in mass-age space,
past which gyrochronology is incapable of delivering precise ages. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Sample selection. Our sample can be divided into two principal target types: 
Kepler Asteroseismic Science Consortium (KASC) targets, and Kepler Objects 
of Interest (KOIs). We focus on those stars with (modelled) ZAMS T_eff values
below 6,200 K, where magnetic braking should be most important; the ZAMS is
defined as the point at which hydrogen fusion dominates the stellar luminosity. We show the
positions of the selected stars on a Hertzsprung-Russell diagram in Extended Data
Fig. 1, and on a period-age plot in Extended Data Fig. 2.

As described elsewhere, the asteroseismic sample is drawn from a magnitude-limited
sample of 2,000 Sun-like stars that were selected for a one-month
period of short-cadence (~1-minute) Kepler observations on the basis of their 
properties in the Kepler Input Catalog (KIC). Of these stars, roughly 500 displayed
evidence of solar-like oscillations. A subset of targets with high
signal-to-noise detections of oscillations was
selected for continued monitoring over Kepler quarters 5-17. Of this sample,
the mode frequencies for a subset of 61 high-signal-to-noise stars were
extracted; there are high-resolution spectroscopic data for 46 of these. We 
modelled 42 of these 46 stars with the asteroseismic modelling portal (AMP, 
described below), excluding 4 targets whose spectra contained a complicated 
pattern of mixed modes. Of the 42 modelled targets, 11 were both detected in 
spot modulation and classified as ‘simple’ solar-like oscillators that did not show 
the seismic hallmarks of F-stars and evolved subgiants. A further three (non-overlapping)
targets were added. Of this sample of 14, 12 targets have AMP ZAMS
T_eff values of less than 6,200 K, yielding a total of 12 stars in the KASC sample.

The KOI sample was selected from the 77 KOIs observed in short cadence that
displayed signatures of solar-like oscillations. Of these, 35 power spectra were of
sufficient quality to extract individual mode frequencies to be modelled, 33 of which
represent unevolved main-sequence stars. A subset of 11 have periods detected via
spot modulation, 7 of which have an AMP ZAMS T_eff of less than 6,200 K.

Finally, we add the two well studied stars from the 16 Cygni binary to our sample;
for these stars, asteroseismic ages and rotation periods have been inferred
from asteroseismic mode splittings. In total, 21 stars are addressed in this analysis.
Where available, we use the updated asteroseismic frequencies of ref. 24. Extended
Data Table 1 shows the seismic (mass, age) and spectroscopic (T_eff, [Fe/H]) values
and rotation periods for these stars.
Age and period measurements. Asteroseismic ages are determined using two 
methodologies: AMP, which provides the ages used in most of this paper; and the 
Bayesian stellar algorithm (BASTA) pipeline, used to verify that the discrepancies 
in predicted and observed rotation periods are not the result of pipeline choice. 
AMP uses a genetic algorithm to perform a search for the global χ² minimum
between the stellar observables and stellar model values. The algorithm uses the
Aarhus stellar evolution code (ASTEC) and adiabatic pulsation code (ADIPLS) to 
compute oscillation frequencies. The BASTA pipeline uses a Bayesian approach 
to model stars with a grid of models produced with the Garching stellar evolution 
code (GARSTEC). The input physics of the stellar models used in each method 
are detailed in refs 8-10. 

Both methods use frequency spacings and spectroscopic constraints to identify 
the optimal stellar properties, but AMP also uses the individual frequencies by 
employing an empirical correction for surface effects. There are two main dif- 
ferences between the models used by BASTA and those used by AMP. BASTA- 
GARSTEC uses a fixed relationship between the initial helium and metallicity, 
anchored to zero metallicity at the primordial helium abundance and assuming 
ΔY/ΔZ = 1.4 to reproduce the solar values (Y is the mass fraction of helium and
Z is the mass fraction of all elements heavier than helium). It also
uses a single solar-calibrated value of the mixing-length parameter for all mod- 
els. AMP-ASTEC allows the initial helium to float independently of metallicity, 
and searches a wide range of values for the mixing-length parameter. Both sets 
of models include diffusion, although BASTA-GARSTEC includes both helium 
and heavy-metal diffusion, while AMP-ASTEC considers only helium diffusion. 

We extract rotation periods using techniques that we summarize briefly here
(full period-extraction diagrams are available at
http://irfu.cea.fr/Phocea/Vie_des_labos/Ast/ast_technique.php?id_ast=3607). For the corrected light curve of
each Kepler star, the autocorrelation function (ACF) and a wavelet decomposition
(period-time) are calculated. We collapse the wavelet decomposition on the period
axis to obtain the global wavelet power spectrum (GWPS), and the peaks of this
GWPS are fitted with Gaussian functions. In parallel, we identify the peaks of the
ACF. The derived surface rotation period is the result of the comparison of the ACF
and GWPS analyses and is confirmed by a visual inspection of the light curves.
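
The ACF half of this procedure can be sketched on a synthetic light curve. This is an illustration only, not the pipeline used here: the light curve is a noiseless sinusoid with an invented 20-day period, the wavelet and GWPS steps are omitted, and all names are hypothetical.

```python
import math


def autocorrelation(flux, max_lag):
    """Normalized autocorrelation of an evenly sampled series for lags 0..max_lag."""
    n = len(flux)
    mean = sum(flux) / n
    dev = [f - mean for f in flux]
    var = sum(d * d for d in dev)
    return [sum(dev[i] * dev[i + lag] for i in range(n - lag)) / var
            for lag in range(max_lag + 1)]


def first_acf_peak(acf):
    """Lag index of the first local maximum after lag 0, or None if there is none."""
    for lag in range(1, len(acf) - 1):
        if acf[lag - 1] < acf[lag] >= acf[lag + 1]:
            return lag
    return None


# Synthetic spot modulation: invented 20-day period, 0.5-day cadence, 200 days.
cadence_days = 0.5
true_period_days = 20.0
times = [i * cadence_days for i in range(400)]
flux = [1.0 + 0.01 * math.sin(2.0 * math.pi * t / true_period_days) for t in times]

acf = autocorrelation(flux, max_lag=100)
peak_lag = first_acf_peak(acf)
period_days = peak_lag * cadence_days
print(f"ACF period estimate: {period_days:.1f} d")
```

Real light curves are noisy and spots evolve, which is one reason the ACF estimate is compared with the GWPS estimate and checked by eye.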
Stellar rotation models. We use a theoretical model grid (using OPAL rather
than OP opacities; all other inputs are unchanged), utilize the same loss-law calibration
and form as in ref. 14, and assume solid-body rotation. The model grid is
expanded to cover a wider range of metallicities and masses, namely [Z/H] = −0.4
to [Z/H] = +0.4, assuming a helium enrichment of ΔY/ΔZ = 1.0 and no diffusion
or gravitational settling. We use the ‘fast launch’ conditions for modelling the
rotation, but have validated that our results are insensitive to the choice of initial
conditions. Changing the launch conditions typically shifts the period ratio (in
the sense of expected/observed) by less than 50% of the quoted errors, and shifts the
fitted critical Rossby number to Ro_crit = 2.15 ± 0.08. The model τ_cz is the local
convective overturn timescale, defined as the ratio of the typical mixing length to
the convective velocity at one pressure scale height above the base of the convective
envelope in the mixing-length theory of convection. Under this definition, the solar
rotation period (P_⊙) is 25.4 days, τ_cz,⊙ is 1.015 × 10^6 s, and Ro_⊙ = 2.16.
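
The solar Rossby number follows directly from the two solar values just quoted. The helper below is a minimal check of that arithmetic; the function name is illustrative.

```python
SECONDS_PER_DAY = 86400.0


def rossby_number(period_days, tau_cz_seconds):
    """Rossby number Ro = P / tau_cz, with P in days and tau_cz in seconds."""
    return period_days * SECONDS_PER_DAY / tau_cz_seconds


# Solar values from the text: P = 25.4 days, tau_cz = 1.015e6 s.
ro_sun = rossby_number(25.4, 1.015e6)
print(f"solar Rossby number: {ro_sun:.2f}")
```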

The weakened magnetic braking is modelled by modifying the braking law such 
that a star with P/τ_cz > Ro_crit is evolved under the assumption of conservation of
angular momentum, such that the rotation period depends only on the changing 
moment of inertia of the star as it evolves. The modified loss law is given by the 
following equations (based on equations (1) and (2) in ref. 14): 


2 
where f_K is a constant factor used to scale the loss law during the empirical fitting;
ω_crit is the saturation threshold (important only at young ages); τ_cz is the convective
overturn timescale; P_phot is the pressure at the photosphere; R is the radius; M is the
mass; L is the luminosity; and ⊙ refers to the Sun. The term c(ω) sets the centrifugal
correction; because our stars are slowly rotating and the correction should be
small, we set c(ω) to a constant value of 1. This braking law is fit to open-cluster
data and the Sun, where the initial rotation period, disk-locking timescale, ω_crit,
and f_K were allowed to vary, and all other parameters were determined using stellar
evolutionary models. When fitting for an optimal Ro_crit, we keep the parameters
of the magnetic braking law calibrated on the Sun and open clusters fixed, and
vary only the Ro_crit at which braking is allowed to cease. Ro_crit is optimized using
a χ² figure of merit (valid under the assumption of independent observations and
Gaussian uncertainties): $\chi^2 = \sum_i (P_{{\rm obs},i} - P_{{\rm mod},i})^2 / (\sigma_{{\rm obs},i}^2 + \sigma_{{\rm mod},i}^2)$, where σ_obs,i is
the observational uncertainty on the extracted period, and σ_mod,i represents the
uncertainty on the model period given the uncertainties on the input masses, ages,
and compositions. We derive uncertainties on Ro_crit using bootstrap resampling,
drawing a 21-star sample with replacement from the original data 50,000 times,
and recalculating the best-fit Ro_crit for each realization. Cluster data and the Sun
are not used in this fit. An alternate fit allowing parameters important for late-time
braking to vary (f_K, Ro_crit) and including intermediate-age and older rotation data
from the seismic sample, NGC 6819, and the Sun (52 stars in total) yields a best-fit
Ro_crit of 2.1 ± 0.1, with f_K = 8.4 ± 0.2.
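
The fitting and bootstrap steps can be sketched as follows. The χ² expression and the resampling scheme follow the description above. Everything else is an assumption made for illustration: the toy model (the fiducial period capped at Ro_crit × τ_cz), the synthetic 21-star sample, the grid search, and the reduced number of resamples.

```python
import random


def chi_square(p_obs, p_mod, sigma_obs, sigma_mod):
    """Chi-square figure of merit with observational and model variances added."""
    return sum((po - pm) ** 2 / (so ** 2 + sm ** 2)
               for po, pm, so, sm in zip(p_obs, p_mod, sigma_obs, sigma_mod))


def toy_model_period(star, ro_crit):
    """Hypothetical stand-in model: the fiducial period, capped at ro_crit * tau_cz."""
    return min(star["p_fiducial"], ro_crit * star["tau_cz"])


def best_fit_ro_crit(stars, grid):
    """Grid value of ro_crit that minimizes chi-square for the given stars."""
    def total(ro_crit):
        return chi_square([s["p_obs"] for s in stars],
                          [toy_model_period(s, ro_crit) for s in stars],
                          [s["sigma_obs"] for s in stars],
                          [s["sigma_mod"] for s in stars])
    return min(grid, key=total)


def bootstrap_ro_crit(stars, grid, n_resamples, seed=0):
    """Mean and standard deviation of best-fit ro_crit over bootstrap resamples."""
    rng = random.Random(seed)
    fits = []
    for _ in range(n_resamples):
        resample = [rng.choice(stars) for _ in stars]  # same size, with replacement
        fits.append(best_fit_ro_crit(resample, grid))
    mean = sum(fits) / len(fits)
    variance = sum((f - mean) ** 2 for f in fits) / (len(fits) - 1)
    return mean, variance ** 0.5


# Synthetic 21-star sample; every value here is invented for illustration.
stars = []
for i in range(21):
    tau_cz = 10.0 + 0.5 * i                  # days
    offset = 0.5 if i % 2 == 0 else -0.5     # fixed scatter, days
    stars.append({"tau_cz": tau_cz,
                  "p_fiducial": 3.0 * tau_cz,
                  "p_obs": 2.16 * tau_cz + offset,
                  "sigma_obs": 0.5,
                  "sigma_mod": 0.5})

grid = [1.5 + 0.01 * k for k in range(151)]  # candidate ro_crit values
ro_best = best_fit_ro_crit(stars, grid)
ro_mean, ro_sigma = bootstrap_ro_crit(stars, grid, n_resamples=200)
print(f"best-fit Ro_crit: {ro_best:.2f}; bootstrap: {ro_mean:.2f} +/- {ro_sigma:.2f}")
```

The sketch uses 200 resamples to keep the run short; the analysis described above uses 50,000.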

Predicted model periods are obtained by using the mass and age from the
asteroseismic pipelines coupled with the spectroscopic metallicity. Model
uncertainties are estimated by generating 50,000 (20,000 for the Ro_crit + f_K fit) realizations
of the input parameters (M, t and [Fe/H]), where values are drawn from a
Gaussian distribution centred on the observed value, with 1σ errors defined by the
observational uncertainties. While we search in the fundamental space of mass,
age and composition, we select only models that fall within 5σ of the observed
T_eff. This constraint has little or no effect for unevolved stars, but ensures that
stars at the turnoff (KIC 6196457 and KIC 8349582 in particular) are not assigned
artificially long rotation periods due to mass-age combinations that fall on the
subgiant branch. The 1σ uncertainties on the model periods are defined as the values
that enclose 68% of the resulting models.
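
The Monte Carlo propagation described above can be sketched in a few lines. The period and T_eff functions below are hypothetical placeholders for the stellar models, the input values describe an invented Sun-like star, and the 68% range is taken as the central 16th to 84th percentile interval, which is an assumption about how the enclosing values are chosen.

```python
import random


def toy_period(mass, age_gyr, feh):
    """Hypothetical stand-in for a rotational-evolution model; returns days."""
    return 25.4 * (age_gyr / 4.57) ** 0.5 * mass ** -2.0 * (1.0 + 0.2 * feh)


def toy_teff(mass):
    """Hypothetical mass-to-T_eff mapping in kelvin, used only for the 5-sigma cut."""
    return 5772.0 * mass ** 0.6


def model_period_interval(obs, n_realizations=5000, seed=1):
    """Median and central 68% interval of model periods over Gaussian input draws."""
    rng = random.Random(seed)
    periods = []
    for _ in range(n_realizations):
        mass = rng.gauss(obs["mass"], obs["mass_err"])
        age = rng.gauss(obs["age_gyr"], obs["age_err"])
        feh = rng.gauss(obs["feh"], obs["feh_err"])
        if mass <= 0.0 or age <= 0.0:
            continue  # discard unphysical draws
        if abs(toy_teff(mass) - obs["teff"]) > 5.0 * obs["teff_err"]:
            continue  # keep only models within 5 sigma of the observed T_eff
        periods.append(toy_period(mass, age, feh))
    periods.sort()
    last = len(periods) - 1
    return (periods[round(0.16 * last)],
            periods[round(0.50 * last)],
            periods[round(0.84 * last)])


# Invented Sun-like star: mass in solar masses, age in Gyr, [Fe/H] in dex, T_eff in K.
star = {"mass": 1.0, "mass_err": 0.03, "age_gyr": 4.57, "age_err": 0.4,
        "feh": 0.0, "feh_err": 0.1, "teff": 5772.0, "teff_err": 80.0}
p_lo, p_med, p_hi = model_period_interval(star)
print(f"model period: {p_med:.1f} (+{p_hi - p_med:.1f}, -{p_med - p_lo:.1f}) d")
```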
Empirical gyrochronology relations. We verify that the unexpectedly rapid rotation
in old, solar-like stars is independent of the spin-down prescription by repeating
our exercise with an empirical literature gyrochronology relation. We replicate
Fig. 2 in Extended Data Fig. 3 with predicted periods drawn from an empirical 
gyrochronology calibration, based on equation (32) in ref. 2: 


$$t = \frac{\tau}{k_C}\ln\left(\frac{P}{P_0}\right) + \frac{k_I}{2\tau}\left(P^2 - P_0^2\right)$$

where t is the age, τ is the convective overturn timescale, P is the period, and P_0 is
the initial period. We adopt values for k_C of 0.646 days per million years (Myr) and
for k_I of 452 Myr per day; P_0 = 1.1 days, and the global τ-T_eff relation is as used in
refs 2,4. 50,000 realizations of the combination (T_eff, t) are drawn from a Gaussian
distribution centred on the measured values, with a 1σ width defined by the quoted
observational errors on the central values. These empirical relationships do not 
account for the physical expansion of stars as they evolve (particularly near the end 
of the main sequence) and therefore tend to predict somewhat more rapid rotation 
than do full theoretical models near the main sequence turnoff. 
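
The empirical relation above gives age as a function of period, so predicting a period for a star of known age requires a numerical inversion. The sketch below evaluates the relation and inverts it by bisection. It takes k_C in days per Myr and k_I in Myr per day, the units that make the two terms dimensionally consistent with τ and P in days and t in Myr. The overturn timescale of 35 days is an illustrative value, not one taken from the τ-T_eff relation of refs 2,4, and the function names are hypothetical.

```python
import math

K_C = 0.646     # days per Myr
K_I = 452.0     # Myr per day
P0_DAYS = 1.1   # initial period, days


def gyro_age_myr(period_days, tau_days, p0_days=P0_DAYS):
    """Age in Myr from the empirical relation, for a period and overturn timescale in days."""
    return ((tau_days / K_C) * math.log(period_days / p0_days)
            + (K_I / (2.0 * tau_days)) * (period_days ** 2 - p0_days ** 2))


def gyro_period_days(age_myr, tau_days, p0_days=P0_DAYS):
    """Invert the relation by bisection; age rises monotonically with period for P >= P0."""
    lo, hi = p0_days, p0_days
    while gyro_age_myr(hi, tau_days, p0_days) < age_myr:
        hi *= 2.0                      # expand the bracket until it contains the age
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gyro_age_myr(mid, tau_days, p0_days) < age_myr:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


tau_example = 35.0                     # illustrative overturn timescale, days
age_example = gyro_age_myr(26.0, tau_example)
period_back = gyro_period_days(age_example, tau_example)
print(f"age for P = 26 d: {age_example:.0f} Myr; recovered period: {period_back:.2f} d")
```

For a roughly solar overturn timescale and period the relation returns an age of about 4.5 Gyr, which is the expected behaviour for a relation anchored to the Sun.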

Cluster data. To provide comparison with the typical gyrochronological calibra- 
tors, we draw cluster data from a variety of literature sources. For the cluster M37 
we adopt the following cluster parameters: extinction, E(B−V) = 0.227 ± 0.038
mag; total metal abundance [M/H] = 0.045 ± 0.044 dex; age = 550 ± 30 Myr.
Rotation data and cluster parameters for Praesepe (M44) are included, with
E(B−V) = 0.027 ± 0.004, [Fe/H] = 0.11 ± 0.03, and log(age) = 8.77 ± 0.1. For
NGC 6811, we adopt the g−r colours, E(B−V) value of 0.1, and rotation periods
as in ref. 26, as well as the [M/H] value of −0.1 ± 0.01 and age of 1.00 ± 0.05 Gyr
from ref. 27. Finally, for NGC 6819 we use the rotation periods and B−V colours
from ref. 4, with the age (2.5 ± 0.2 Gyr) and adopted metallicity (0.09 ± 0.03) from
ref. 28. B−V colours are converted into temperatures and stellar masses using Yale
rotating stellar evolution (YREC) isochrones. We model cluster stars in the same
manner as the seismic targets, with 10,000 mass-age-composition realizations
for each star. We display the mean cluster rotation periods for all stars within the
ZAMS T_eff bins, with errors representing the 16th and 84th percentiles. In M37 and
Praesepe in particular, the rotational distribution displays a range resulting from 
spread in the initial rotation periods. 
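
The binned summary described above (mean period per ZAMS Teff bin, with the 16th and 84th percentiles as the error range) might be computed along the following lines. This is an illustrative sketch only: the function names and the input layout are hypothetical, and the percentile uses simple linear interpolation between order statistics, which may differ from the convention the authors used.

```python
import math


def percentile(values, q):
    """Linearly interpolated percentile (q in 0-100) of a non-empty list."""
    data = sorted(values)
    if len(data) == 1:
        return data[0]
    pos = (len(data) - 1) * q / 100.0
    lower = math.floor(pos)
    upper = min(lower + 1, len(data) - 1)
    frac = pos - lower
    return data[lower] + frac * (data[upper] - data[lower])


def summarize_bins(stars, bin_edges):
    """Summarize rotation periods in temperature bins.

    stars: list of (zams_teff_K, period_days) pairs (hypothetical layout).
    bin_edges: list of (low, high) bounds in K; low is inclusive, high exclusive.
    Returns {(low, high): (mean, 16th percentile, 84th percentile)}.
    """
    summary = {}
    for low, high in bin_edges:
        periods = [p for teff, p in stars if low <= teff < high]
        if not periods:
            continue  # leave empty bins out of the summary
        mean = sum(periods) / len(periods)
        summary[(low, high)] = (mean, percentile(periods, 16), percentile(periods, 84))
    return summary
```

Reporting percentiles rather than a standard deviation keeps the summary meaningful when the period distribution within a bin is skewed, as it is for the younger clusters.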

Sample biases. We demonstrate that our results are unlikely to be a consequence of 
selection bias in our sample. The sample is subject to two sources of selection bias: 
asteroseismic detectability, and the detectability of spot modulation. 

Detailed asteroseismic analysis requires a high signal-to-noise detection 
of the power excess from oscillations. Oscillation amplitudes scale roughly as
A_max ∝ (L/M)(Teff)^−2 (equation (7) from ref. 30, referring to l = 0 radial modes);
seismic samples are therefore strongly biased towards more-massive stars. There 
is also a bias towards bright targets, where lower noise levels contribute to detect- 
ability. Our sample is drawn from two subsets of stars: the one-month survey stars 
from the seismic sample, and the KOIs. We expect the one-month survey seismic
detections at magnitudes Kp of less than ~10, while the roughly 1,000-day time
series in short cadence collected for the KOI sample allow detections out to Kp ≈ 12,
which well describes the actual magnitude distribution of our sample (see Fig. 6
in ref. 30). The strong trends with magnitude and mass are well predicted by basic 
scaling arguments, save for the dependence on activity: active stars are less likely 
to be detected in oscillations*®. Our sample is selected seismically, and we do not 
expect the well understood seismic biases to favour rapid rotators (apart from the 
obvious mass dependence). 

Variability owing to starspots scales with the rotation period, in the sense that 
more rapid rotation is associated with higher amplitudes of variability'®. One could 
imagine that we are detecting the rapidly rotating tail of a distribution of rotation 
periods, or detecting objects spun up by binary/planetary interactions or mergers. 

This first case is at odds with what we know from open clusters: as late as 2.5 Gyr, 
there is a converged, well defined rotational sequence that shows very little scatter 
at fixed mass’. If we are in fact detecting a rapidly rotating subset of the population, 
the dispersion in rotation and spin-down rates must set in after several billion 
years, or it would be visible in the open-cluster data. If there is dispersion in the 
rotation periods, it represents a serious challenge to the validity of gyrochronology 
for old stars, regardless of its source. 

The pipeline used to extract the rotation periods for this work has been tested 
with an injection and recovery exercise¹⁷. Our recovery fraction is shown in
Extended Data Fig. 4, and demonstrates that we should be able to detect stars that 
are substantially less active than the Sun at longer periods. However, this exercise 
does not account for stars that simply cease to have spots to detect on their sur- 
faces; under this scenario, slow rotators could exist but be undetectable. We cannot 
directly combat this concern given our current data set, although we can examine 
the case of 16 Cygni (16 Cyg). 16 Cyg A and B are not detected in spot modulation; 
their periods are derived from asteroseismic frequency splittings, which yield peri- 
ods that probe the envelope rotation”*. If we assume that these stars have solar-like 
rotation profiles, then the seismic rotation periods are directly comparable to the 
surface periods. This pair displays the same anomalously rapid rotation as objects 
detected in spot modulation, providing evidence against the argument that stars 
undetected in spot modulation are simply more slowly rotating. It is also worth 
noting that our own Sun would be undetectable during the minimum of its activity 
cycle (see ref. 17). Our non-detections could equally be the result of the normal 
variations in the activity of Sun-like stars, rather than a period bias. 

Finally, we examine the possibility of interactions or mergers with other bodies.
In our sample, 16 Cyg A and B, and KIC 3427720 and KIC 9139151, are
known or suspected binaries’. In each case, the components are well separated,
and the binary orbital periods are estimated to be well in excess of 10,000 years. In order for a
companion to affect rotation considerably, it must be at orbital periods compa- 
rable to the rotation period, and will therefore be unresolved. The KOI sample 
has undergone the extensive vetting that is associated with planet detection; 
all planets are confirmed, and there is no evidence of transit timing variations 
that would accompany a close stellar companion. System stability is unlikely 
for binary orbits of the order of 30 days that contain even a low-mass stellar 
companion*!. Likewise, there is no evidence for interaction between the planets 
and the host stars in the KOI sample®, and no known hot Jupiters. In the case of 
the seismic sample, there is no evidence for double-lined binaries,
photometric–spectroscopic temperature disagreements, multiple oscillating components,
or unusual dilution of the seismic power spectra, and no evidence of eclipses. 
Finally, if mergers (planetary or stellar) were responsible for all detections of 
rapid rotation, then the 50% detection rate of the ‘simple stars’”” in spot modula- 
tion implies an uncomfortably high merger rate. 

The asteroseismic age scale. We carry out two tests to demonstrate that the dis- 
crepancy between the expected and observed rotation periods is not due to a sys- 
tematic bias in the ages with roots in the asteroseismic age scale. We show that 
ages derived with the BASTA pipeline display the same trend in rotation period, 
and that systematically shifting the asteroseismic ages, while improving the fit, is 
inferior to instituting a Rossby threshold. 

Extended Data Fig. 5 provides period ratio plots using the BASTA ages and
BASTA ZAMS Teff determinations. The systematic trend in the period ratios survives.
The Barnes relation² fits with χ² = 184.3, and the fiducial models!* with
χ² = 68.4. A fit for Ro_crit using the BASTA ages yields Ro_crit = 2.67 ± 0.50. Bootstrap
resampling demonstrates that this number is sensitive to whether KIC 8349582 is
drawn; if KIC 8349582 is excluded, the fit becomes Ro_crit = 2.12 ± 0.12.

We also investigate the possibility that the seismic age scale is systematically 
shifted relative to the true ages. We perform model fits with the fiducial braking 
law with an extra parameter that allows for a systematic age shift. For the AMP
ages, χ² is minimized with the Barnes relation with a systematic shift of 35% and
a χ² of 78.5. Likewise, the fiducial models! prefer a shift of 20 ± 3% with a χ² of
26.9. In both cases, the required systematic shifts are larger than the estimated 9.6%
systematic uncertainties in seismic ages!”.
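
The age-shift test can be pictured as a one-parameter scan: rescale every age by a common factor, recompute the predicted periods, and keep the shift that minimizes χ². The sketch below assumes a multiplicative shift and substitutes a hypothetical Skumanich-like period model for the braking laws used in the text; the names, the normalization and the grid of shifts are illustrative choices, not the authors' fitting code.

```python
import math


def chi_squared(observed, errors, predicted):
    """Sum of squared, error-normalized residuals."""
    return sum(((o - p) / e) ** 2 for o, e, p in zip(observed, errors, predicted))


def stand_in_period_model(age_gyr):
    """Hypothetical spin-down law, P proportional to sqrt(age).

    Normalized to 25 days at 4.5 Gyr purely for illustration; it is not the
    Barnes relation or the fiducial models of the text.
    """
    return 25.0 * math.sqrt(age_gyr / 4.5)


def best_age_shift(ages_gyr, periods, period_errors, model, shifts):
    """Scan fractional age shifts and return (best_shift, minimum chi-squared).

    Each trial multiplies every age by (1 + shift) before predicting periods.
    """
    best_shift, best_chi2 = None, float("inf")
    for shift in shifts:
        predicted = [model(age * (1.0 + shift)) for age in ages_gyr]
        chi2 = chi_squared(periods, period_errors, predicted)
        if chi2 < best_chi2:
            best_shift, best_chi2 = shift, chi2
    return best_shift, best_chi2
```

A preferred shift that exceeds the quoted systematic uncertainty of the age scale, as reported above, indicates that rescaling the ages alone is not a plausible explanation of the discrepancy.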

Finally, to verify that we are not biased by the fact that the ages and periods were
determined using different evolution codes, we tune the physics in the fiducial
models" to match that of the AMP models, and predict the rotation periods for the 
central AMP values of the masses, ages, and compositions of each star. In particular, 
we match the diffusion physics, opacity tables, equation of state, helium and metal 
abundances, boundary conditions, and important nuclear reaction rates present in 
the ASTEC code used for AMP. The results are presented in Extended Data Fig. 6, 
and demonstrate that the discrepancy between the predicted and observed periods 
is preserved. We conclude that our result is not the consequence of assumptions 
about the stellar physics included in models. 
Code availability. The AMP science code used to infer stellar ages can be down- 
loaded at https://amp.phys.au.dk/about/evolpack. Code for the period extraction 
and rotational evolution will be publicly released upon completion of the necessary 
documentation. YREC likewise has no public documentation, and has not been 
publicly released. BASTA is undergoing major revisions for increased speed and 
is not yet publicly available.

Extended Data Figure 1 | The positions of all 21 Kepler stars on the
Hertzsprung–Russell diagram. We plot spectroscopic Teff (a proxy
for mass) versus seismic log(g) (surface gravity), with 1σ observational
error bars; the symbol size is proportional to the period ratio (AMP ages,
fiducial models"). Colours and symbol conventions are as in Fig. 2.
Evolutionary tracks are overplotted for [Z/H] = +0.3 (dotted lines) and
[Z/H] = −0.1 (solid lines), for masses 0.8–1.3 M☉ in increments of 0.1 M☉.
([Z/H] = +0.3, M = 0.8 M☉ is beyond the plotted area.)

Extended Data Figure 2 | Period–age plot of sample stars. The 21-star sample, with observed rotational periods plotted against AMP asteroseismic
ages. Symbol conventions are as in Fig. 2. The solid line denotes the empirical relation² for Teff = 5,800 K (approximately equal to the mean sample Teff).
All error bars represent 1σ.

Extended Data Figure 3 | Period ratios using empirical gyrochronology
relations. Ratios of predicted periods² to observed periods are plotted as
a function of the AMP asteroseismic age, and divided according to AMP
ZAMS Teff (a, ZAMS Teff = 5,900–6,200 K; b, ZAMS Teff = 5,600–5,900 K;
c, ZAMS Teff = 5,100–5,400 K). Error bars represent 1σ. Symbol
conventions are as in Fig. 2.

Extended Data Figure 4 | Detectability of stars in spot modulation.
Detection fractions for the 750 stars with noise in the hound-and-hare
exercise of ref. 17, as a function of activity level A (where the activity level
of the Sun is defined as A☉ = 1) and rotation period. The total number of
light curves searched for periodicity in each cell is overplotted. The dashed
black line at P = 35 days represents the expected period for stars like the
Sun under traditional gyrochronology relations found in the literature.

Extended Data Figure 5 | Predicted versus observed rotation periods
using ages determined with BASTA. a, c, e, Plotted are the ratios of the
periods predicted using the fiducial models" to the observed rotation
periods, as a function of stellar age. The grey band represents the offset
expected from models in which Ro_crit = 2.16. All error bars represent 1σ.
b, d, f, Ratios of the predicted periods obtained from the empirical
relation² to the observed periods, plotted against stellar age. Stars are
divided according to ZAMS Teff, using BASTA ZAMS Teff values: a, b,
5,900–6,200 K; c, d, 5,500–5,900 K; e, f, 5,100–5,400 K. All symbol
conventions are as in Fig. 2.

Extended Data Figure 6 | The shift in the period ratios induced by
changing the stellar model input physics. Circles are colour-coded
according to ZAMS Teff as in Fig. 2. Ratios of the periodicity expected
from the fiducial model" to the observed periodicity are plotted against
age. Arrows denote the shift in the period ratio that occurs when YREC
models’ are run to match the AMP-ASTEC physics.

Extended Data Table 1 | Rotation periods, asteroseismic data and spectroscopic quantities for sample stars

          AMP                                      BASTA/GARSTEC                       Spectroscopic
KIC       Mass       Age        log(g)  ZAMS Teff  Mass         Age         ZAMS Teff  Teff     [Fe/H]      Period      Note

16 Cyg A  1.10±0.02  7.07±0.46  4.295   5677       1.04±0.01    6.95±0.26   5668       5825±50  +0.09±0.02  23.8±1.7    seismic period
16 Cyg B  1.06±0.02  6.82±0.28  4.360   5629       0.998±0.005  7.02±0.14   5592       5750±50  +0.05±0.02  23.2±7.4    seismic period
3427720   1.13±0.04  2.23±0.17  4.388   5985       1.12±0.02    2.22±0.31   6019       6040±84  −0.03±0.09  13.9±2.1    seismic
3656476   1.17±0.03  8.13±0.59  4.246   5642       1.07±0.01    7.68±0.42   5525       5710±84  +0.25±0.09  31.7±3.5    seismic
5184732   1.27±0.04  4.17±0.40  4.270   5905       1.18±0.02    4.05±0.42   5810       5840±84  +0.38±0.09  19.8±2.4    seismic
6116048   1.01±0.03  6.23±0.37  4.270   5838       1.06±0.02    5.54±0.34   5943       5935±84  −0.24±0.09  17.3±2.0    seismic
6196457   1.23±0.04  5.51±0.71  4.053   6064       1.21±0.02    5.52±0.50   5991       5871±94  +0.17±0.11  16.4±1.2    KOI
6521045   1.04±0.02  6.24±0.37  4.118   5933       1.11±0.02    6.50±0.51   5886       5825±75  +0.02±0.10  25.3±2.8    KOI
7680114   1.13±0.03  7.19±0.70  4.184   5801       –            –           –          5855±84  +0.11±0.09  26.3±1.9    seismic
7871531   0.84±0.02  9.15±0.47  4.479   5253       0.84±0.02    10.10±0.99  5240       5400±84  −0.24±0.09  33.7±2.6    seismic
8006161   1.04±0.02  5.04±0.17  4.502   5165       0.948±0.005  5.08±0.10   5250       5390±84  +0.34±0.09  29.8±3.1    seismic
8349582   1.19±0.04  7.93±0.94  4.178   5695       1.07±0.02    8.03±0.75   5630       5699±74  +0.30±0.10  51.0±1.5    KOI
9098294   1.00±0.03  7.28±0.51  4.314   5718       1.01±0.02    6.93±0.57   5734       5840±84  −0.13±0.09  19.8±1.3    seismic
9139151   1.14±0.03  1.71±0.19  4.376   6092       1.16±0.02    1.79±0.46   6019       6125±84  +0.11±0.09  11.0±2.2    seismic
9955598   0.96±0.01  6.43±0.47  4.506   5307       0.89±0.01    6.98±0.45   5250       5460±75  +0.08±0.10  34.7±6.3    KOI
10454113  1.19±0.04  2.03±0.29  4.315   6138       1.15±0.03    2.86±0.54   6095       6120±84  −0.06±0.09  14.6±1.1    seismic
10586004  1.16±0.05  6.35±1.37  4.072   5943       1.18±0.03    6.43±0.62   5753       5770±83  +0.29±0.10  29.8±1.0    KOI
10644253  1.13±0.05  1.07±0.25  4.402   6001       1.16±0.02    1.20±0.39   5991       6030±84  +0.12±0.09  10.91±0.87  seismic
10963065  1.07±0.02  4.36±0.46  4.294   6063       1.09±0.02    4.18±0.44   6076       6104±74  −0.20±0.10  12.4±1.2    KOI
11244118  1.10±0.05  6.43±0.58  4.077   6023       1.13±0.02    6.90±0.44   5677       5745±84  +0.35±0.09  23.2±3.9    seismic
11401755  1.03±0.05  5.85±0.93  4.043   6094       1.06±0.03    7.10±0.60   6057       5911±66  −0.20±0.06  17.2±1.4    KOI

Units are as follows: mass, solar masses; age, Gyr; log(g), with g in cm s⁻²; Teff, K; period, days. Quoted errors are given to 1σ.

Controlling many-body states by the electric-field
effect in a two-dimensional material


L. J. Li, E. C. T. O’Farrell, K. P. Loh, G. Eda, B. Özyilmaz & A. H. Castro Neto


To understand the complex physics of a system with strong 
electron-electron interactions, the ideal is to control and monitor 
its properties while tuning an external electric field applied to 
the system (the electric-field effect). Indeed, complete electric- 
field control of many-body states in strongly correlated electron 
systems is fundamental to the next generation of condensed matter 
research and devices¹⁻³. However, the material must be thin enough
to avoid shielding of the electric field in the bulk material. Two- 
dimensional materials do not experience electrical screening, 
and their charge-carrier density can be controlled by gating. 
Octahedral titanium diselenide (1T-TiSe2) is a prototypical two- 
dimensional material that reveals a charge-density wave (CDW) 
and superconductivity in its phase diagram⁴, presenting several
similarities with other layered systems such as copper oxides⁵,
iron pnictides⁶, and crystals of rare-earth elements and actinide
atoms⁷. By studying 1T-TiSe2 single crystals with thicknesses of
10 nanometres or less, encapsulated in two-dimensional layers of 
hexagonal boron nitride, we achieve unprecedented control over the 
CDW transition temperature (tuned from 170 kelvin to 40 kelvin), 
and over the superconductivity transition temperature (tuned from 
a quantum critical point at 0 kelvin up to 3 kelvin). Electrically 
driving TiSe2 over different ordered electronic phases allows us 
to study the details of the phase transitions between many-body 
states. Observations of periodic oscillations of magnetoresistance 
induced by the Little-Parks effect show that the appearance of 
superconductivity is directly correlated with the spatial texturing of 
the amplitude and phase of the superconductivity order parameter, 
corresponding to a two-dimensional matrix of superconductivity. 
We infer that this superconductivity matrix is supported by a matrix 
of incommensurate CDW states embedded in the commensurate 
CDW states. Our results show that spatially modulated electronic 
states are fundamental to the appearance of two-dimensional 
superconductivity. 

The charge-carrier density—or equivalently, the Fermi energy— 
strongly controls phase transitions in correlated systems. Traditionally, 
charge-carrier density can be controlled by doping, that is, by chem- 
ical modification of the material. Unfortunately, the alteration of the 
system’s chemical composition leads to the unavoidable introduction 
of disorder. In strongly correlated systems, owing to their exponen- 
tial sensitivity to the local electronic environment, disorder can have 
a profound impact that masks the intrinsic many-body behaviour⁸.
Hence, there is a growing need to change the charge-carrier density of 
strongly correlated systems without chemical means. The application 
of an electric field is one of the ‘cleanest’ ways (that is, it tends not to 
introduce disorder) to address many-body states because it is intrinsi- 
cally homogeneous. However, electric fields are screened by the bulk 
material in three-dimensional metals, making their use difficult. 

The Fermi energy not only controls the number of electric carriers 
(electrons or holes) but also the screening of external electric fields
and internal electron–electron interactions⁹. In two-dimensional (2D)
systems the electrons move in a plane while the electric field propagates 
in three-dimensional space. Hence, 2D electrons are unable to screen 
electric fields, external or their own. Therefore, we chose to work with a 
2D material, TiSe2, of nanometre-scale thickness, and we used an ionic 
gel electrolyte gate to apply the electric field. In addition, the flake of 
TiSe2 was encapsulated by a 2D dielectric, hexagonal boron nitride, to
avoid external disorder and chemical oxidation and degradation caused 
by both air and the electrolyte. 

Electrical transport measurements under electric-field-induced 
doping enabled us to construct the phase diagram shown in Fig. 1. 
Electron doping suppresses the CDW transition from 170 K to 40 K
and superconductivity appears with a dome that peaks at 3 K. We show 
that the emergence of superconductivity is directly associated with the 
inhomogeneous electronic states that correspond to a periodic struc- 
ture of the amplitude and phase shifts of the superconductivity order 
parameter. This periodic structure must be stabilized and pinned to 
the lattice, so we can infer the presence of an incommensurate CDW 
(ICDW) matrix surrounding commensurate CDW (CCDW) regions. 

TiSe2 nanosheets with thicknesses of 10 nm or less were prepared by
mechanical exfoliation of a high-quality single crystal (Extended Data 
Fig. 1). The device fabrication and measurement details are described 
in Methods and Supplementary Information. In Fig. 2a we sketch the 
electric-field double layer transistor device used in our experiments; 
Fig. 2b shows a typical top-gate sweep at 285 K and the variation of 
the electron density as measured by the Hall effect (see Methods and 
Extended Data Fig. 2a, b). Using an electrolyte top gate and an elec- 
trostatic doped-Si bottom gate we could control the electron density 
up to about 10¹⁵ cm⁻² and thereby explore the phase diagram of this
2D material. 

Variation of the charge-carrier density n leads to strong variations 
of the sheet resistance Rs of the device, as shown in Fig. 2c, d. At low 
charge-carrier densities, one can clearly see a peak in the resistivity 
versus temperature. The CDW transition temperature¹⁰, T_CDW, corresponds
to the inflection point of the resistance and was also measured
by using the Hall effect (Extended Data Fig. 2c), to detect the reconstruction
of the Fermi surface. On increasing the charge-carrier density,
T_CDW decreases from 170 K to 40 K before becoming undetectable at
around n = 7.5 × 10¹⁴ cm⁻².

On increasing the electron density we observe the superconductivity 
state, as shown in Fig. 2d. The superconductivity transition temperature,
Tc, increases from 0 K at the quantum critical point (QCP) at
n = 1.2 × 10¹⁴ cm⁻² up to approximately 3 K at an optimal density of
n = 7.5 × 10¹⁴ cm⁻². We note that this is exactly the density at which the
CDW signal vanishes, indicating a scenario of two competing orders¹¹.
A further increase in density suppresses Tc, giving rise to the formation
of a superconductivity dome, as shown in Fig. 1 together with
representations of the inferred structure in each region of the phase 
diagram (discussed below).

Figure 1 | Phase diagram of TiSe2 under electron doping. Circles show
T_K–T and squares show T_CDW. The insets show the lattice structure in each
phase. In the CDW phases we illustrate the atomic displacements within
an enlarged unit cell; in the phase in which ICDW and CDW coexist we
schematically illustrate the ICDW domain walls between the CCDW regions
as the red region (which we have exaggerated to occupy a single unit cell
instead of a few nanometres). The error bars define the difference in T_CDW
between the values derived from resistivity versus temperature curves and
those derived from charge-carrier density versus temperature curves.


Figure 2 | Characterization of the field effect device and the resistance
at different doping levels by gating. a, Sketch of the electric-field double
layer transistor device. V_SD, source–drain voltage; V_TG, top gate voltage;
V_BG, bottom gate voltage. b, Typical device characteristic under ion gel
top-gate operation at 285 K. Sheet resistance Rs (in units of kΩ per square)
is shown on the left and electron density δn_2D, measured by the Hall effect
at 285 K, on the right. c, d, At high and low temperatures, respectively, the
CDW and superconductivity transitions can be clearly identified.

When the superconducting coherence length ξ(T) becomes larger
than the sample thickness close to Tc we expect the material to behave
as a 2D system and the superconducting transition is anticipated to be
of the Kosterlitz–Thouless (K-T) type with vortex–antivortex unbinding¹².
One of the trademarks of the K-T transition is the broadening
of the resistance with lowering of the temperature, as shown in Fig. 2d.
For the K-T transition the resistivity is expected to scale with the coherence
length as:

$$\xi(T) \sim a\,\exp\!\left(-\frac{b}{\sqrt{T - T_{\rm K\text{-}T}}}\right) \qquad (1)$$

where T_K–T is the K-T transition temperature, and a and b are material
parameters. The experimental result reproduces this relation close to T_K–T,
as shown in Fig. 3a for a charge-carrier density of n = 2.67 × 10¹⁴ cm⁻².
We also observe current–voltage scaling¹³ in the superconductivity
phase (V ∝ I^α) with α = 5 for n = 5.9 × 10¹⁴ cm⁻² at the lowest
temperature (Extended Data Fig. 3a). By fitting to equation (1), the
K-T transition temperatures can be extracted for each doping level. In
Fig. 3b we show the behaviour of T_K–T close to the QCP as a function of
electron density. Quantum critical scaling¹⁴ predicts T_K–T ∝ (n − n_c)^{zν},
where z is the dynamical exponent and ν is the correlation length exponent.
As shown in Fig. 3b and Extended Data Fig. 3b and c, we find
zν ≈ 2/3. The same scaling was observed in other systems!*!7, and
indicates that the superconductivity transition is of the classical
three-dimensional XY or, equivalently, 2D quantum universality class'®’”.
In the absence of a specific screening or dissipation mechanism'*,
z is expected to be 1, so that ν = 2/3, which implies that our system is
in the clean limit by the Harris criterion”.

The temperature dependence of the sheet resistance can be written
as Rs = R_3K + CT^α (where R_3K is the residual resistance at 3 K), which
decreases monotonically with charge-carrier density as shown in
Fig. 3c, in accordance with the above conclusion regarding the Harris
criterion. In an ordinary metal (or Fermi liquid) we expect α = 2, independent
of doping. Nevertheless, in Fig. 3c we find 1 < α < 2 (Extended
Data Fig. 4a, b) over the entire phase diagram. Notice that at around
n = 7.5 × 10¹⁴ cm⁻², α = 1.5 extends down to temperatures close to
the superconductivity transition, which would seem to indicate the
presence of another QCP, owing to suppression of the CDW, inside
the superconductivity dome. In what follows we show that this is not
the case.
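
The quantum critical scaling of the K-T transition temperature with density, a power law in (n − n_c) with exponent zν, can be estimated from a straight-line fit in log-log space. The sketch below is an assumption-laden illustration rather than the authors' analysis: the function name is hypothetical, the critical density is treated as known (1.2 in units of 10¹⁴ cm⁻², the QCP quoted in the text), and measurement errors are ignored.

```python
import math


def fit_power_law_exponent(densities, temperatures, n_c):
    """Least-squares fit of T = A * (n - n_c)**exponent in log-log space.

    densities and n_c share the same units; every density must exceed n_c
    and every temperature must be positive, otherwise math.log raises.
    Returns (exponent, prefactor A).
    """
    xs = [math.log(n - n_c) for n in densities]
    ys = [math.log(t) for t in temperatures]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return exponent, prefactor
```

In practice n_c is itself uncertain, so a fuller treatment would fit it jointly with the exponent; fixing it keeps the example linear and easy to check.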

Figure 3 | Temperature dependence of the sheet resistance Rs close to
the K-T transition. a, Fitting of the resistance to the K-T formula for a
charge-carrier density of n = 2.67 × 10¹⁴ cm⁻². b, Behaviour of T_K–T at the
QCP with critical exponent zν ≈ 2/3. c, α values derived by fitting the


The presence of the competing orders in this 2D system has striking
consequences for the electronic transport. In Fig. 4a we show the magnetoresistance
as a function of magnetic field for a density of
n = 5.9 × 10¹⁴ cm⁻². The magnetoresistance in the superconductivity
phase is positive, as expected, but we clearly observe the presence of
plateaus and oscillations in the data. By taking the derivative of the
magnetoresistance, dR/dB, in Fig. 4b we observe that these features are
temperature-independent and have well defined periods. The periodicity
in magnetic field reflects a spatial periodicity given by the cyclotron
equation, ℓ = (Φ₀/δB)^{1/2}, where Φ₀ = h/(2e) = 2,068 T nm² is the flux
quantum and δB is the magnetic field periodicity. We have analysed the
magnetoresistance data as a function of electron density (or gate
voltage). (Magnetoresistance data for other electron densities are shown
in Extended Data Fig. 5.) One can see a clear trend in the data (as shown
in Fig. 4c): the length scale decreases monotonically with electron density
from ℓ(n) ≈ 450 nm at n ≈ 1.3 × 10¹⁴ cm⁻² to ℓ(n) ≈ 170 nm at

It is clear that the well defined structure of the magnetoresistance
reflects spatial fluctuations of the electronic pairing. A consistent explanation
for these features is based on the Little-Parks effect²¹, whereby
Cooper pairs are constrained to move in loops in the material; that
is, pairing is local and constrained to well defined regions. The length
scale we observe is associated with the trapping of magnetic flux quanta
by the Cooper pairs. The existence of such a superconductivity matrix
over a range of temperatures and charge-carrier densities in a single
crystal is remarkable, leading us to suggest that the superconductivity
matrix must be pinned and stabilized by an underlying matrix of inhomogeneous
electronic states. Fluctuations of an underlying charge or
spin order parameter have led to the discovery of superconductivity in
a wide range of systems; the suppression of the CDW transition from
170 K to 40 K, concomitant with the appearance of superconductivity


and non-Fermi-liquid behaviour, strongly indicates that the CDW
plays a part.

[Figure 4, panels c and d. Axes: c, charge-carrier density (10¹⁴ cm⁻²); d, V (mV), at T = 0.1 K.]
oscillating period By and the corresponding length scale Dcpw. Error bars
define the 90% confidence interval. d, Charge-carrier-density-dependent
two-terminal conductance dI/dV shows the ZBCP, indicating non-s-wave
superconducting pair symmetry. Charge-carrier density n = C × 10¹⁴ cm⁻²,
where C is indicated by the colour; see legend.

14 JANUARY 2016 | VOL 529 | NATURE | 187

© 2016 Macmillan Publishers Limited. All rights reserved

We now consider how local variations of the CDW can stabilize the 
superconductivity matrix. The CDW corresponds to a spatial modulation
of the charge density δρ = Δ(r) e^{−iQ·r}, where Δ is the CDW
order parameter, r is the position in the 2D plane and Q is the CDW
ordering vector. On the basis of symmetry alone, the Ginzburg–Landau
free energy for Δ can be written as²²

$$F = \int d\mathbf{r}\,\Big\{ a(\Delta)^2 + b(\Delta)^3 + c(\Delta)^4 + \frac{1}{2m^*Q^2}\Big[\,\big|\mathbf{Q}\cdot(\nabla - i\mathbf{Q})\Delta\big|^2 + \kappa\,\big|\mathbf{Q}\times\nabla\Delta\big|^2\Big]\Big\} \qquad (2)$$

where a, b, c, m* and κ are phenomenological parameters that determine
the energy scale for the spatial variation of ρ. Variations in the
amplitude Δ are energetically very costly because the CDW has to be
locally destroyed. However, variations in the CDW phase are energetically
allowed and can be expressed as

$$\Delta(\mathbf{r}) = \Delta_0 \exp\!\left[\,i\left(\frac{\mathbf{K}}{2}\cdot\mathbf{r} - \theta(\mathbf{r})\right)\right] \qquad (3)$$

where Δ₀ is the CDW order parameter in the uniform CCDW phase,
K is a reciprocal lattice vector in the direction of Q, θ(r) is a spatially
varying phase and we have included the known CCDW wavevector
for TiSe2 (1/2, 1/2, 1/2). When θ = n (where n is an integer) we again
have a CCDW, whereas if θ(r) = Q·r we have an ICDW. For illustrative
purposes, we assume that the variation in θ(x) is one-dimensional in
nature and substitute equation (3) into equation (2) to obtain

$$\delta F = \int dx\,\Big\{\big[\partial_x\theta(x) - 1\big]^2 - g\big[1 - \cos 2\theta(x)\big]\Big\} \qquad (4)$$


where g depends on the Ginzburg–Landau parameters and
x = |K/2 − Q| r is the dimensionless length scale. The last equation
reflects how the free energy changes locally with the phase. Minimizing
equation (4) with respect to θ we obtain

$$\frac{\partial^2\theta}{\partial x^2} = -2g\,\sin(2\theta) \qquad (5)$$

the differential equation for the pendulum (x plays the part of ‘time’),
which has periodic solutions with period ℓ_ICDW ≈ π/(g^{1/2}|K/2 − Q|).
These spatially periodic variations of the phase represent Néel-like
domain walls of ICDW between CCDW regions with different phase,
where ℓ_ICDW is the thickness of the domain wall. McMillan described²³
how these defects allow a uniform ICDW with slowly varying phase to 
break apart into domains of CCDW separated by ICDW domain walls 
that have more rapidly varying phases. The domain wall density is x/Q 
to match the homogeneous ICDW state. 
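
As a consistency check on the quoted period, equation (5) can be linearized for small phase excursions. This small-angle step is an assumption made here for illustration; the exact pendulum period involves an elliptic integral and is longer at large amplitude, which is compatible with the approximate sign used in the text.

```latex
\begin{align}
\frac{\partial^2\theta}{\partial x^2} &= -2g\,\sin(2\theta) \;\approx\; -4g\,\theta
  \qquad (|\theta| \ll 1), \\
\theta(x) &\propto \cos\!\left(2\sqrt{g}\,x\right)
  \;\;\Rightarrow\;\; \Delta x = \frac{2\pi}{2\sqrt{g}} = \frac{\pi}{\sqrt{g}}, \\
x = |K/2 - Q|\,r &\;\;\Rightarrow\;\; \Delta r = \frac{\pi}{g^{1/2}\,|K/2 - Q|} .
\end{align}
```

The last line reproduces the period given above once the dimensionless coordinate x is converted back to the physical length r.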

A full solution to this problem in 2D is lacking, but we speculate 
that domain walls form a periodic matrix illustrated schematically in 
Fig. 1; blue CCDW regions with constant phases are embedded in a
periodic ICDW matrix. We note that a similar structure was observed
by scanning transmission microscope measurements of the closely
related 1T-TaS2, in which the ICDW state exists at ambient conditions²⁴.
The self-organizing principle is that repulsive interactions
occur between domain walls owing to higher-order terms in the free
energy”. Therefore the ICDW domains will form a matrix, breaking
the CCDW into domains with fixed area, as required by the Little- 
Parks effect. 
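
The fixed domain area invoked here connects to the measured magnetoresistance period through the flux-quantum relation quoted earlier: a length scale equal to the square root of Φ₀/δB, with Φ₀ = h/(2e). The short sketch below converts between the two. The function names are hypothetical; the constants are the exact SI values of h and e.

```python
import math

PLANCK_H = 6.62607015e-34            # J s (exact SI value)
ELEMENTARY_CHARGE = 1.602176634e-19  # C (exact SI value)


def flux_quantum_t_nm2():
    """Superconducting flux quantum h/(2e) expressed in T nm^2."""
    phi0_wb = PLANCK_H / (2.0 * ELEMENTARY_CHARGE)  # Wb, i.e. T m^2
    return phi0_wb * 1e18                           # 1 m^2 = 1e18 nm^2


def length_scale_nm(delta_b_tesla):
    """Length scale (nm) for a magnetoresistance period delta_b (T)."""
    return math.sqrt(flux_quantum_t_nm2() / delta_b_tesla)


def field_period_tesla(length_nm):
    """Field period (T) at which one flux quantum threads an area length_nm**2."""
    return flux_quantum_t_nm2() / length_nm ** 2
```

Under these assumptions, the 450 nm and 170 nm length scales reported in the text correspond to field periods of roughly 10 mT and 72 mT, respectively.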

As shown by ref. 23, ICDW dynamic phase fluctuations—that 
is, phonon modes of the ICDW (not the lattice) —can exist in these 
domain walls. It is conceivable that these ICDW phonons induce 
superconductivity pairing and localize Cooper pairs in one-dimensional
regions of the 2D system. Another intriguing aspect of our results
is displayed in Fig. 4d, which shows the point-contact conductance
spectra measured at each density, in which we observe a clear zero bias
conductance peak (ZBCP) in the superconductivity state. Extended
Data Fig. 6 shows its temperature and magnetic field dependence at
a density of n = 2.1 × 10¹⁴ cm⁻². ZBCPs are observed in a wide range
of unconventional superconductors and are understood to arise by
Andreev reflection from a Cooper pairing potential having an internal
phase shift of the superconductivity order parameter²⁵. These results
are therefore in stark contrast to the experimentally determined single-gap
s-wave superconductivity observed in the Cu-intercalated CuxTiSe2
(ref. 26). It is unlikely that our 2D samples would develop a superconductivity
order parameter that is qualitatively distinct from that
of CuxTiSe2 (for example, d-wave). The existence of the ZBCP therefore
suggests that, together with the spatial modulation of the superconductivity
amplitude (which is demonstrated by the Little-Parks
effect), there may also be a modulation of the superconductivity phase,
although the correspondence between the amplitude variation and the
phase variation cannot be determined from our measurements.

The observed state in 1T-TiSe2 bears some similarity to the pair 
density wave (PDW) superconducting CDW phases. However, further 
experiments are required to substantiate the PDW hypothesis. 
Although one-dimensional PDW states have attracted more attention 
within the context of the copper oxide superconductors, more general 
PDW states having phase and amplitude variations in 2D are expected 
to be possible. 

The coexistence of CCDW and ICDW was first observed by recent 
X-ray measurements of TiSe2 at pressures close to where the 
superconductivity phase was expected. ICDW domain walls with a periodicity 
along the c axis of ~300 nm were observed, similar to the length scale 
determined in this experiment. While the periodicity was most pro- 
nounced along the c axis, a weak in-plane signal of incommensurability 
was observed that might correspond to the electronic microstructure 
observed here in the superconducting order (Abbamonte, P., personal 
communication, 15 January 2015). 

In summary, we studied samples of TiSe2 a few nanometres in thick- 
ness and tuned the material through the CDW and superconductiv- 
ity phases using the electric-field effect. This technique allowed us to 
study in great detail the QCP in the material and classify its universality 
class. We also identified the interplay between superconductivity and 
CDW through the formation of an inhomogeneous many-body state 
which we identify with the localization of Cooper pairs along a matrix 
of incommensurate dislocations surrounding regions of CCDW. We 
conjecture that the superconductivity has its origin in the coupling 
of the McMillan phonon modes of the ICDW with the electrons. These 
results open up opportunities for electric-field tuning of many-body 
states in condensed matter research. 

METHODS 


Crystal growth and quality verification. TiSe2 single crystals were grown in 
two steps by the chemical vapour transport method. First, polycrystalline TiSe2 
was prepared by mixing high-purity titanium powder (from Alfa Aesar, 99.99%) 
and selenium powder (from Alfa Aesar, 99.999%) in a stoichiometric ratio and 
heating the mixture at 800 °C for 3 days in a vacuum-sealed (<10~° Torr) silica 
tube. Second, the polycrystalline powder was loaded into a two-zone tube furnace 
together with the transport agent I2 at a concentration of 5 mg cm⁻³. The 
polycrystalline powder was then heated to 670 °C and single crystals of TiSe2 were collected 
at 600°C over a period of 10 days. 

The quality of the bulk single crystals was confirmed by X-ray diffraction 
and temperature-dependent Raman spectroscopy, as shown in Extended Data 
Fig. 1a and b. Further energy dispersive X-ray spectroscopy verified the stoichio- 
metric composition of the crystals. 
Device fabrication and characterization. TiSe2 was exfoliated in a pure argon 
atmosphere by Scotch tape onto a SiO2 (300 nm)/Si wafer and examined under 
a high-resolution optical microscope. The non-uniformity in thickness can be 
discriminated by cross-correlation of the colour with atomic force microscope 
measurements of the height. Flakes with uniform thickness of around 10nm or 
less and a long bar shape were selected for the device fabrication. Electrodes for 
transport measurements were fabricated by standard electron beam lithography 
techniques using a polymethylmethacrylate (PMMA) positive resist, followed by 
deposition of Ti (10 nm)/Au (65 nm). Thin crystals (one to three layers) of 
commercial hexagonal boron nitride were transferred onto the nanosheets within the 
argon atmosphere; the role of hexagonal boron nitride is to protect the TiSe2 
from degradation by both oxidation and damage by the electrolyte gate. 

Atomic force microscope results show that the surface is clean (as shown in 
Extended Data Fig. 2b), with a roughness within ±1 nm, which may result from 
the non-uniform thickness of the TiSe2 flake. 

Electrical transport measurements were performed in both a ⁴He cryostat and 
a ³He/⁴He dilution cryostat, using standard alternating-current (a.c.) lock-in 
amplifier and direct-current (d.c.) techniques. Resistance-versus-temperature 
and field measurements were performed using currents of 10–100 nA to avoid 
Joule heating. 

The ion gel solution was prepared by mixing the triblock copolymer polysty- 
rene-polymethylmethacrylate-polystyrene (PS-PMMA-PS) and the ionic liquid 
1-ethyl-3-methylimidazolium bis(trifluoromethylsulfonyl)imide (EMIM-TFSI) 
into an ethyl propionate solvent (the weight ratio of polymer to ionic liquid to 
solvent is 0.7:9.3:90). After covering the device with ion gel droplets by drop 
casting, the device as shown in Extended Data Fig. 2a was loaded into the cryostat 
and kept at room temperature and high vacuum for one hour to remove residual 
water from the electrolyte. Afterwards, resistance was measured against gate volt- 
age to characterize the capability of the ion gel; a typical electrolyte gate sweep is 
shown in Fig. 1b. 

The charge-carrier density doping by the ionic gate can be derived from the 
Hall-effect measurement both at high (285 K) and low (3 K) temperature; the for- 
mer is shown in Extended Data Fig. 2c, and the latter is used to construct the phase 
diagram because the latter has a better direct correlation with the superconducting 
dome. Although the hexagonal boron nitride passivation prevents the accumula- 
tion of ions directly at the surface of TiSe2, this was not found to reduce the capacity 
of the gate much, as demonstrated by our results and those of a recent work. 
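
As an illustration of how a Hall measurement translates into a sheet carrier density, the following is a minimal sketch that assumes a single-band picture, n = 1/(e |dRxy/dB|). The function name and the example Hall slope are hypothetical and are not values reported in this work.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in coulombs (exact SI value)


def sheet_density_cm2(hall_slope_ohm_per_tesla):
    """Return the 2D carrier density in cm^-2 from the Hall slope dRxy/dB.

    Assumes a single carrier type, so that n = 1 / (e * |dRxy/dB|).
    """
    n_per_m2 = 1.0 / (E_CHARGE * abs(hall_slope_ohm_per_tesla))
    return n_per_m2 * 1e-4  # convert m^-2 to cm^-2


# Hypothetical Hall slope, chosen only so that the result falls in the
# 10^14 cm^-2 range discussed in the text.
example_slope = 2.34  # ohm per tesla
example_density = sheet_density_cm2(example_slope)
print(f"n = {example_density:.3e} cm^-2")
```

In a multi-band material this single-band estimate is only an effective density, so it should be read as a rough guide rather than a full analysis.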
2D superconducting properties and the K-T transition. As discussed in the main 
text, the superconducting transition under different fixed perpendicular magnetic 
fields was measured. Extended Data Fig. 3b shows the magnetoresistance plot for 
a charge-carrier density of n = 2.67 × 10¹⁴ cm⁻². The upper critical field Hc2(T) 
values can be determined from each curve at the intercept of extrapolations from 
the normal state and the superconductivity state. Hc2(0) can be derived by 
extrapolating the plot curve to zero temperature, which gives a value of 450 mT. The 
superconducting coherence length can therefore be derived from 
$\xi(0) = \sqrt{\Phi_0 / (2\pi H_{c2}(0))}$, where $\Phi_0 = h/2e$ is the 
superconducting flux quantum. The minimum ξ, which corresponds to the maximum 
Hc2 point, is about 22 nm, which is more than twice as large as the thickness of 
the measured device, indicating that superconductivity in our device is expected 
to have a 2D character. 
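
The conversion between upper critical field and coherence length can be checked numerically. The sketch below assumes the standard Ginzburg-Landau relation ξ(0) = sqrt(Φ0 / (2π Hc2(0))) with Φ0 = h/2e; the function names are illustrative.

```python
import math

PHI0 = 2.067833848e-15  # superconducting flux quantum h/2e, in webers


def coherence_length_nm(hc2_tesla):
    """Ginzburg-Landau coherence length (nm) for a given upper critical field (T)."""
    return math.sqrt(PHI0 / (2.0 * math.pi * hc2_tesla)) * 1e9


def hc2_from_coherence_length(xi_nm):
    """Inverse relation: upper critical field (T) for a given coherence length (nm)."""
    xi_m = xi_nm * 1e-9
    return PHI0 / (2.0 * math.pi * xi_m ** 2)


xi_at_450mT = coherence_length_nm(0.450)        # Hc2(0) value quoted in the text
hc2_for_22nm = hc2_from_coherence_length(22.0)  # field implied by a 22 nm length
print(f"xi = {xi_at_450mT:.1f} nm, Hc2 = {hc2_for_22nm:.2f} T")
```

Under this relation the 450 mT value corresponds to a coherence length of roughly 27 nm, and a 22 nm coherence length corresponds to a field of roughly 0.68 T, which fits the statement that the minimum ξ belongs to the maximum Hc2 point.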

For the K-T transition, the current-voltage response becomes nonlinear; a 
V ∝ I^α relation is expected as a result of the vortex–antivortex pair unbinding. 
Consistent behaviour is also observed in our sample, as shown in Extended Data 
Fig. 3a. 
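
A power-law exponent of this kind is commonly obtained from a straight-line fit on a log-log plot. The following is a minimal sketch of such a fit on synthetic data; it is not the fitting code used for Extended Data Fig. 3a, and the data values are invented for illustration. In the usual K-T picture the exponent α reaches 3 at the transition temperature.

```python
import math


def fit_power_law_exponent(currents, voltages):
    """Least-squares slope of log(V) against log(I), i.e. alpha in V ~ I**alpha."""
    xs = [math.log(i) for i in currents]
    ys = [math.log(v) for v in voltages]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic current-voltage data that follow V = 0.5 * I**3 exactly.
currents = [1.0, 2.0, 4.0, 8.0]
voltages = [0.5 * i ** 3 for i in currents]
alpha = fit_power_law_exponent(currents, voltages)
print(f"alpha = {alpha:.3f}")
```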

To confirm the νz value that determines the nature of the quantum critical 
behaviour in our system, we also measured the temperature-dependent 
superconducting transition under perpendicular magnetic field at a fixed 
charge-carrier density of n = 2.67 × 10¹⁴ cm⁻² (see Extended Data Fig. 3b). By using 
finite size scaling with the formula $R_s/R_c = F((B - B_c)\,T^{-1/\nu z})$, where 
Rc and Bc are two fitting parameters and F is an arbitrary function with 
F(0) = 1 (ref. 38), the data are expected to collapse into two sets of lines, with 
a certain νz value. As displayed in Extended Data Fig. 3c, the data collapse for 
νz ≈ 2/3, which confirms the previous result. 
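
The scaling collapse described above can be sketched as a change of variables applied to every measured point. The code below is an illustrative sketch on synthetic data, assuming the scaling form Rs/Rc = F((B − Bc) T^(−1/νz)); the parameter values and the linear test function F are hypothetical.

```python
def scaling_variable(b_tesla, t_kelvin, b_c, nu_z):
    """Scaled field x = (B - Bc) * T**(-1/(nu z))."""
    return (b_tesla - b_c) * t_kelvin ** (-1.0 / nu_z)


def collapse(curves, r_c, b_c, nu_z):
    """Map {T: [(B, Rs), ...]} onto sorted (x, Rs/Rc) points.

    If the chosen Rc, Bc and nu z are right, points measured at different
    temperatures fall onto a common curve F(x) with F(0) = 1.
    """
    points = []
    for t_kelvin, data in curves.items():
        for b_tesla, r_s in data:
            x = scaling_variable(b_tesla, t_kelvin, b_c, nu_z)
            points.append((x, r_s / r_c))
    return sorted(points)


# Synthetic data generated from F(x) = 1 + 0.1 x (hypothetical parameters).
R_C, B_C, NU_Z = 100.0, 0.45, 2.0 / 3.0


def synthetic_resistance(b_tesla, t_kelvin):
    return R_C * (1.0 + 0.1 * scaling_variable(b_tesla, t_kelvin, B_C, NU_Z))


curves = {
    t: [(b, synthetic_resistance(b, t)) for b in (0.40, 0.45, 0.50)]
    for t in (0.1, 0.2, 0.3)
}
points = collapse(curves, R_C, B_C, NU_Z)
```

In practice Rc, Bc and νz would be varied until the spread between curves is minimized; that optimization step is left out of this sketch.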

Magnetoresistance oscillations at other doping levels. The magnetoresistance 
oscillation is observed when we sweep a perpendicular magnetic field at different 
temperatures in the superconducting state. From the QCP point n = 1.2 × 10¹⁴ cm⁻² 
to the near-optimum doping n = 5.9 × 10¹⁴ cm⁻², the oscillations can be observed 
for all doping levels. However, these oscillations can only be clearly observed for 
certain temperatures T0 and magnetic fields B0, and the T0 and B0 values increase 
with increasing doping. For instance, for n = 1.3 × 10¹⁴ cm⁻², T0 is 0.3 K and B0 
is 0.06 T; for n = 2.7 × 10¹⁴ cm⁻², T0 increases to 0.4 K and B0 increases to 0.13 T, 
as one can see from Extended Data Fig. 4. Although T0 and B0 values as well 
as the periods of oscillation δB increase with doping level, the amplitude of the 
magnetoresistance oscillation does not depend monotonically on doping level. 
We find that the oscillating amplitude for doping levels of 1.3 × 10¹⁴ cm⁻² and 
5.9 × 10¹⁴ cm⁻² is larger than that for other doping levels we measured. One can 
clearly see more contrast or sharpness for the periodic straight lines in Fig. 4b 
and Extended Data Fig. 5c than in Extended Data Fig. 5d. The stronger 
magnetoresistance oscillations at these doping levels could be related to the enhanced 
Cooper-pair phonon interaction, induced by strong quantum fluctuation. 
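
To relate an oscillation period δB to a domain size, one can use the Little-Parks condition of one superconducting flux quantum per domain, δB = Φ0/A. The sketch below assumes square domains of side L (so A = L²), which is a simplification introduced here for illustration and not a geometry stated in the text.

```python
import math

PHI0 = 2.067833848e-15  # superconducting flux quantum h/2e, in webers


def period_mT(domain_size_nm):
    """Oscillation period (mT) for one flux quantum per square domain of side L."""
    area_m2 = (domain_size_nm * 1e-9) ** 2
    return PHI0 / area_m2 * 1e3


def domain_size_nm(period_tesla):
    """Side length (nm) of a square domain that encloses one flux quantum per period."""
    return math.sqrt(PHI0 / period_tesla) * 1e9


# Example: a 300 nm length scale, as quoted for the X-ray measurements.
example_period = period_mT(300.0)
print(f"delta B = {example_period:.1f} mT for a 300 nm domain")
```

Under these assumptions a 300 nm domain corresponds to a period of about 23 mT, and a period that grows with doping corresponds to a shrinking domain area.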
Temperature dependence of the sheet resistance. We plot the temperature 
dependence of the sheet resistance between 3 K and 100 K with the doping level 
ranging from 4 × 10¹⁴ cm⁻² to 13 × 10¹⁴ cm⁻², as shown in Extended Data Fig. 5a 
and b. By taking the temperature derivative d(log(R − R0))/d(log(T)), the 
power-law exponent a is extracted at each doping as a function of the temperature. 
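
The logarithmic derivative can be estimated from discrete data with central differences. The following is a minimal sketch on synthetic data, assuming the residual resistance R0 is already known; the numbers are invented and only illustrate that a curve of the form R = R0 + A·T^(3/2) returns an exponent of 3/2.

```python
import math


def local_exponent(temps, resistances, r0):
    """Central-difference estimate of d(log(R - R0)) / d(log(T)).

    Returns (T, exponent) pairs for the interior points of the data set.
    """
    result = []
    for i in range(1, len(temps) - 1):
        dlog_r = math.log(resistances[i + 1] - r0) - math.log(resistances[i - 1] - r0)
        dlog_t = math.log(temps[i + 1]) - math.log(temps[i - 1])
        result.append((temps[i], dlog_r / dlog_t))
    return result


# Synthetic curve R = R0 + A * T**1.5 with hypothetical R0 and A.
R0, A = 10.0, 0.02
temps = [5.0, 10.0, 20.0, 40.0, 80.0]
resistances = [R0 + A * t ** 1.5 for t in temps]
exponents = local_exponent(temps, resistances, R0)
```

With measured data the result depends on the choice of R0 and on noise, so some smoothing is usually needed before differentiating.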

At doping levels away from optimal doping, 7.5 × 10¹⁴ cm⁻², we observe 
Fermi-liquid behaviour at low temperatures below T_CDW. At the optimal doping level an 
exponent of 3/2 is observed over a wide range of temperature; this exponent is 
similar to the one observed in MnSi. As described in the main text, microscopic 
fluctuations of the order parameters from those of a CCDW to those of an ICDW 
give rise to this temperature dependence. 
Point-contact conductance spectroscopy. Point-contact conductance spectroscopy 
of the normal-superconducting junction between Au/Ti and TiSe2 was 
performed by the two-terminal a.c. + d.c. method, whereby the d.c. voltage is 
modulated with an additional a.c. voltage, such that the derivative dI/dV can be 
measured at the first harmonic by a current preamplifier and standard lock-in 
amplifier techniques. 

The contacts were patterned by standard electron beam lithography using a 
PMMA positive resist. The development of the resist is performed in air to allow 
the oxidization of the contact region such that the contacts (despite not being 
nanoscale) are in the so-called ‘soft’ contact regime that has been successfully 
applied to pnictide and copper oxide superconductors. In this regime 
spectroscopic information can be obtained because the transport is primarily through 
multiple point-like pinholes whose individual dimension is smaller than the mean 
free path in the contact. 

Extended Data Figure 2 | The Hall bar device and its characterization by Hall effect measurement. a, Optical microscope picture. b, Atomic force 
microscope picture of the Hall bar device. c, Temperature dependence of the charge-carrier density measured by the Hall effect at different top gate 
voltages, Vtg. Scale bar, 5 μm. 


© 2016 Macmillan Publishers Limited. All rights reserved 


Extended Data Figure 3 | Characterization of the K-T transition. 
a, The current-voltage power-law fit for n = 5.9 × 10¹⁴ cm⁻² at different 
temperatures is consistent with the behaviour of the 2D K-T transition. 
b, Temperature-dependent magnetoresistance of the superconducting 
transition at different fixed perpendicular magnetic fields for 
n = 2.67 × 10¹⁴ cm⁻². c, The magnetoresistance data in b collapse 
into two sets of lines by so-called finite size scaling. 




Extended Data Figure 4 | The R versus T power-law fit indicates the existence of strong quantum fluctuation. a, Temperature dependence of the sheet 
resistance for different doping levels. b, The data shown in a are plotted on a log-log scale. 

Extended Data Figure 5 | The magnetoresistance oscillation for charge-carrier densities of 1.3 × 10¹⁴ and 2.7 × 10¹⁴ cm⁻². a, c, Perpendicular 
magnetic-field-dependent magnetoresistance measured at different temperatures. b, d, Plots of dRs/dB against B and T for n = 1.3 × 10¹⁴ cm⁻² and 
n = 2.7 × 10¹⁴ cm⁻², respectively. 

Rapid removal of organic micropollutants from 
water by a porous β-cyclodextrin polymer 


Alaaeddin Alsbaiee¹, Brian J. Smith¹, Leilei Xiao¹, Yuhan Ling², Damian E. Helbling² & William R. Dichtel¹ 


The global occurrence in water resources of organic 
micropollutants, such as pesticides and pharmaceuticals, has 
raised concerns about potential negative effects on aquatic 
ecosystems and human health. Activated carbons are the most 
widespread adsorbent materials used to remove organic pollutants 
from water but they have several deficiencies, including slow 
pollutant uptake (of the order of hours) and poor removal of 
many relatively hydrophilic micropollutants. Furthermore, 
regenerating spent activated carbon is energy intensive (requiring 
heating to 500–900 degrees Celsius) and does not fully restore 
performance. Insoluble polymers of β-cyclodextrin, an 
inexpensive, sustainably produced macrocycle of glucose, are 
likewise of interest for removing micropollutants from water by 
means of adsorption. β-cyclodextrin is known to encapsulate 
pollutants to form well-defined host-guest complexes, but until 
now cross-linked β-cyclodextrin polymers have had low surface 
areas and poor removal performance compared to conventional 
activated carbons. Here we crosslink β-cyclodextrin with 
rigid aromatic groups, providing a high-surface-area, mesoporous 
polymer of β-cyclodextrin. It rapidly sequesters a variety of 
organic micropollutants with adsorption rate constants 15 to 200 
times greater than those of activated carbons and non-porous 
β-cyclodextrin adsorbent materials. In addition, the 
polymer can be regenerated several times using a mild washing 
procedure with no loss in performance. Finally, the polymer 
outperformed a leading activated carbon for the rapid removal of 
a complex mixture of organic micropollutants at environmentally 
relevant concentrations. These findings demonstrate the promise 
of porous cyclodextrin-based polymers for rapid, flow-through 
water treatment. 

Porous β-CD-containing polymers (P-CDPs) were derived from 
nucleophilic aromatic substitution of hydroxyl groups of β-CD by 
tetrafluoroterephthalonitrile (1). Although 1 has been copolymerized 
with bifunctional catechols previously, its reaction with aliphatic 
alkoxides is undescribed. β-CD and 1 were polymerized in a 
suspension of K2CO3 in tetrahydrofuran (THF) at 80 °C to provide 
a pale-yellow precipitate in 20% yield, which proved to be a mesoporous 
high-surface-area polymer with the expected chemical bonds 
(Fig. 1a). The yield was further improved to 45% by performing the 
polymerization in THF:DMF (dimethylformamide) 9:1 by volume, 
in which β-CD is more soluble (see Supplementary Information section 
‘Improved synthesis of P-CDP’ and Supplementary Figs 1 and 2). 
Following activation under high vacuum, N2 porosimetry of the 
P-CDPs provided type II isotherms indicative of mesoporosity, and 
their Brunauer-Emmett-Teller surface areas (S_BET) ranged from 35 
to 263 m² g⁻¹, depending on the molar feed ratio of 1:β-CD employed 
in the polymerization (Fig. 1b). P-CDPs obtained from a 1:β-CD ratio 
of 3:1 consistently exhibited the highest surface areas. Non-local density 
functional theory (NLDFT) calculations applied to the isotherms 
indicate that pores of 1.8–3.5 nm diameter comprise the majority of 
the free volume of P-CDP (Fig. 1c), much like the pore size distributions 
of activated carbons (ACs; Extended Data Fig. 1). Alternative 
polymerization conditions (aqueous NaOH, 60 °C) produced a similar 
polymer that lacked permanent porosity (non-porous (NP)-CDP, 
S_BET = 6 m² g⁻¹), which serves as a useful control to demonstrate the 
importance of surface area for rapid micropollutant removal (see 
Methods section ‘Synthetic procedures’). Water regain analysis of 
P-CDP and NP-CDP also reflected the higher pore volume of the 
former, which took up 265% of its weight when dispersed in H2O as 
compared to 86% for the latter (Extended Data Table 1). Nevertheless, 
NP-CDP swells to a much greater degree, as its H2O uptake is approximately 
300 times its dry pore volume, as compared to a factor of only 
23 for P-CDP. P-CDP’s combination of high H2O uptake and modest 
swelling is desirable, as these parameters maximize adsorbent performance 
and minimize undesirable pressure drops associated with 
filtration processes. 

Compositional analysis and spectroscopic characterization of 
the P-CDP and NP-CDP networks indicated the presence of both 
1 and β-CD moieties in the polymers (see Methods section ‘FT-IR 
and solid-state ¹³C NMR characterization of P-CDP and NP-CDP’, 
Extended Data Figs 2 and 3). The ratio of 1:β-CD in each polymer was 
determined by combustion analysis, with 6.1 equiv. of 1 per β-CD for 
P-CDP and 3.5 equiv. of 1 per β-CD for NP-CDP (see Methods section 
‘Synthetic procedures’). Therefore P-CDP is more densely crosslinked 
than NP-CDP, which might be responsible for its permanent porosity. 
The F:N ratio also indicates that the terephthalonitrile moieties in each 
polymer are substituted by 2.1 and 2.2 alkoxides on average, which 
is consistent with model studies (see model reactions S1–S4 and 
Supplementary Figs 3–7), which suggest that the β-CD macrocycles 
are linked predominantly at both the small and large rims through 
disubstituted terephthalonitrile moieties. 

The high surface area and permanent porosity of P-CDP enable 
the rapid removal of organic contaminants from water. Bisphenol A, 
a component of plastics that has attracted concerns as an endocrine 
disruptor, was chosen as a model pollutant to enable comparison 
with established adsorbents. We compared uptake of bisphenol 
A by P-CDP, NP-CDP, and a non-porous β-CD polymer crosslinked 
with epichlorohydrin (EPI-CDP, S_BET = 23 m² g⁻¹), which is the most 
extensively studied β-CD polymer for water purification and has been 
commercialized. We also tested three types of mesoporous ACs: the 
hybrid AC/ion exchange resin used in commercial Brita point-of-use 
filters (Brita AC, S_BET = 507 m² g⁻¹), DARCO granular activated 
carbon (GAC, S_BET = 612 m² g⁻¹), and Norit RO 0.8 activated carbon 
(NAC, S_BET = 984 m² g⁻¹), which is a leading AC typically used for 
high-value water purification (Extended Data Fig. 1). Each adsorbent 
(1 mg ml⁻¹) eventually removed most of the bisphenol A from a 
0.1 mM (22.8 mg l⁻¹) aqueous solution, corresponding to equilibrium 
uptakes of 19–24 mg bisphenol A per g adsorbent (Extended Data 
Table 2, Extended Data Fig. 4), with P-CDP near the high end of this 
range (22 mg g⁻¹). More importantly, P-CDP removed bisphenol A 
much more quickly than all other adsorbents, reaching ~95% of its 
equilibrium uptake in 10 s (Fig. 2a). In contrast, NP-CDP required 
30 min to reach equilibrium and adsorbed only 46% of its equilibrium 
value in 10 s, indicating that the near-instantaneous adsorption 
of bisphenol A by P-CDP is attributable to its porosity. Likewise, 
EPI-CDP required more than 1 h to reach equilibrium and only adsorbed 
22% of its equilibrium value after 10 s, which is consistent with previous 
reports. Finally, Brita AC and GAC each required more than 1 h 
to reach equilibrium, while NAC required 10 min (Fig. 2a). NAC only 
adsorbed 53% of its equilibrium value in 10 s despite its nearly four 
times higher surface area than P-CDP. 

Figure 1 | β-CD polymer networks derived from nucleophilic aromatic 
substitution reactions. a, Left, synthesis of the high-surface-area porous 
P-CDP from β-CD and 1. Right, schematic of the P-CDP structure. b, 
N2 adsorption (blue squares) and desorption (grey squares) isotherms of 
P-CDP. The solid line is a guide to the eye. S_BET is the Brunauer-Emmett-Teller 
(BET) surface area (in units of m² g⁻¹) of P-CDP calculated from the 
N2 adsorption isotherm, and P and P0 are the equilibrium and saturation 
pressures of N2 at 77 K, respectively. c, The cumulative pore volume of 
P-CDP obtained by NLDFT analysis indicates the polymer’s mesoporous 
structure. 

The apparent pseudo-second-order rate constant (k_obs) of bisphenol 
A adsorption to P-CDP is 1.5 mg g⁻¹ min⁻¹, which is 15 times 
higher than the high-performance NAC and two or more orders of 
magnitude higher than the other studied adsorbents (Extended Data 
Table 2, Extended Data Fig. 4). To our knowledge, this rate constant is 
the highest reported for bisphenol A or any other pollutant removed 
by ACs, mesoporous silicas or carbohydrate-based adsorbents 
under similar experimental conditions. k_obs amalgamates the 
performance of readily accessible binding sites (conceptualized as 
the outer surface of the adsorbent) and less accessible binding sites 
(conceptualized as being within the adsorbent’s interior). P-CDP’s 
superior k_obs for bisphenol A adsorption indicates that nearly all of its 
β-CD binding sites are readily accessible, a feature not found in other 
adsorbents of which we are aware. We further probed the readily accessible 
binding sites of each adsorbent by determining the flow-through 
uptake of bisphenol A. In these experiments, the adsorbent (~3 mg) 
was trapped as a thin layer on a 0.2 μm syringe filter, and aqueous 
bisphenol A (3 ml, 0.1 mM) was passed rapidly through the filter at a 
flow rate of 9 ml min⁻¹. Under these conditions, P-CDP removed 80% 
of the bisphenol A from the solution, corresponding to more than 85% 
of its equilibrium uptake (Fig. 2b), whereas NAC removed 59% of the 
bisphenol A under the same conditions, indicating that nearly half of 
its binding sites are not accessible on the 20 s timescale. The superior 
performance of P-CDP further indicates that most of its β-CD moieties 
are rapidly accessed by bisphenol A. 
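
A pseudo-second-order rate constant of the kind discussed above is often extracted from the linearized form t/q = 1/(k·qe²) + t/qe. The sketch below fits that form to synthetic uptake data; it is an assumption that this matches the fitting procedure in the Methods, which are not reproduced here, and the input values (qe = 22, k = 1.5) are used only to generate illustrative data.

```python
def pso_uptake(t, q_e, k):
    """Pseudo-second-order uptake q(t) = k*qe^2*t / (1 + k*qe*t)."""
    return k * q_e ** 2 * t / (1.0 + k * q_e * t)


def fit_pso(times, uptakes):
    """Fit t/q = 1/(k*qe^2) + t/qe by linear least squares; return (qe, k)."""
    ys = [t / q for t, q in zip(times, uptakes)]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys)) / sum(
        (t - mean_t) ** 2 for t in times
    )
    intercept = mean_y - slope * mean_t
    q_e = 1.0 / slope              # slope of the line is 1/qe
    k = slope ** 2 / intercept     # intercept of the line is 1/(k*qe^2)
    return q_e, k


# Synthetic, noise-free data (times in minutes, uptake in mg per g).
times = [0.25, 0.5, 1.0, 2.0, 5.0, 10.0]
uptakes = [pso_uptake(t, 22.0, 1.5) for t in times]
q_e_fit, k_fit = fit_pso(times, uptakes)
```

The flow-through contact time quoted in the text follows from simple arithmetic: 3 ml passed at 9 ml min⁻¹ takes 20 s.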

Figure 2 | Rate of bisphenol A uptake by various adsorbents. 
a, Time-dependent adsorption of aqueous bisphenol A (0.1 mM) by 
each adsorbent (1 mg ml⁻¹; see main text for adsorbent details). 
b, Removal of bisphenol A upon rapid flowing of the solution through 
a thin layer of the adsorbent. The data are reported as the average of 
triplicate experiments. Error bars, minimum and maximum removal. 
c, The average percentage bisphenol A (BPA) removal efficiency by 
P-CDP after consecutive regeneration cycles. P-CDP was regenerated by 
rinsing the spent adsorbent with MeOH at room temperature. The data 
are reported as the average of triplicate experiments. Error bars, minimum 
and maximum removal. 

The thermodynamic parameters of P-CDP’s bisphenol A adsorption 
are consistent with the formation of β-CD inclusion complexes. 
P-CDP’s equilibrium uptake as a function of residual bisphenol A 
concentration after adsorption, [BPA], fitted the Langmuir model 
(Methods section ‘Thermodynamic studies of adsorption’, Extended 
Data Fig. 5), suggesting 1:1 inclusion complex formation with an 
association constant (K) of 56,000 M⁻¹, which is comparable to the 
values reported for other β-CD polymers. Furthermore, the maximum 
adsorption capacity at equilibrium (q_max,e) was found to be 
88 mg g⁻¹, which is similar to the highest reported value of an EPI-CDP 
(84 mg g⁻¹), and corresponds to a bisphenol A:β-CD molar 
ratio of 0.9. Therefore, most of the β-CD units in the polymer are able 
to form 1:1 complexes with bisphenol A at equilibrium. At even higher 
concentrations of bisphenol A, P-CDP achieves bisphenol A:β-CD 
ratios greater than 1, presumably by binding bisphenol A on the outside 
of the CD rings or through other non-specific interactions. For 
example, 1 mg ml⁻¹ of P-CDP adsorbs 200 mg g⁻¹ of bisphenol A from 
a 1 mM aqueous solution, indicating significant capacity beyond 1:1 
CD inclusion complexes. However, these results suggest that P-CDP’s 
binding properties will reflect those of β-CD inclusion complex formation 
at concentrations relevant for water purification. 
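
For orientation, the Langmuir model referred to above can be written as q = q_max·K·c / (1 + K·c), where c is the residual concentration. The following minimal sketch evaluates it with the K and q_max values quoted in the text; the function name is illustrative, and the residual concentration used in the second example is arbitrary.

```python
def langmuir_uptake(c_molar, q_max, k_assoc):
    """Langmuir equilibrium uptake q = q_max * K * c / (1 + K * c)."""
    return q_max * k_assoc * c_molar / (1.0 + k_assoc * c_molar)


Q_MAX = 88.0        # mg per g, maximum capacity quoted in the text
K_ASSOC = 56000.0   # per molar, association constant quoted in the text

# At c = 1/K (about 18 micromolar) half of the sites are occupied.
half_capacity = langmuir_uptake(1.0 / K_ASSOC, Q_MAX, K_ASSOC)

# Uptake at an arbitrary residual concentration of 10 micromolar.
uptake_10uM = langmuir_uptake(10e-6, Q_MAX, K_ASSOC)
print(f"{half_capacity:.1f} mg/g at c = 1/K, {uptake_10uM:.1f} mg/g at 10 uM")
```

The model saturates at q_max, so uptakes above 88 mg g⁻¹, such as the 200 mg g⁻¹ observed from a 1 mM solution, lie outside what a single 1:1 binding site can describe.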

In contrast to the energy intensive and degradative regeneration processes 
of ACs, bisphenol A is easily removed from P-CDP by rinsing 
the polymer with MeOH at room temperature. Five consecutive 
bisphenol A adsorption/desorption cycles were performed and the 
regenerated P-CDP exhibited almost no decrease in performance 
compared to the as-synthesized polymer (Fig. 2c). A functional cost 
analysis of P-CDP indicates raw materials costs of US dollar (USD) 
3.70 per kg assuming further optimization of the polymerization, 
corresponding to manufacturing cost estimates of USD 5–25 per kg 
(see Supplementary Information section ‘Functional cost analysis of 
P-CDP’). These estimates show promise compared to the costs (more 
than USD 9 per kg) of advanced ACs used for water treatment and 
wholesale costs of GAC (USD 22 per kg) and NAC (USD 47 per kg) 
used in this study. P-CDP’s superior performance, facile regeneration 
procedure, and practical estimated cost make it reasonable to expect 
that it will prove economically competitive with ACs when full life-cycle 
analyses are performed. 

Figure 3 | Compound P-CDP rapidly adsorbs a broad range of organic 
micropollutants. a, Structures and relevance of each tested emerging 
organic micropollutant. Physicochemical properties most relevant to 
adsorption processes for each compound are available in Supplementary 
Table 2. b, Time-dependent adsorption of each pollutant (0.1 mM) by 
P-CDP (1 mg ml⁻¹). c, Percentage removal efficiency of each pollutant 
obtained by rapidly flowing the adsorbate solution through a thin layer 
of P-CDP (blue), NAC (red) or EPI-CDP (green). The data are reported 
as the average uptake of triplicate experiments. Error bars, minimum and 
maximum uptake. 

Figure 4 | P-CDP outperforms NAC for the rapid removal of a complex 
mixture of pollutants at environmentally relevant concentrations. 
Percentage removal efficiency of a pollutant mixture obtained by rapidly 
flowing the adsorbate solution (8 ml) through a thin layer (0.3 mg) of 
P-CDP (blue) or NAC (red). Individual pollutant concentrations (in units 
of μg l⁻¹) in the mixture were: 100 (BPA); 2.5 (BPS); 5 (metolachlor); 
100 (propranolol); 50 (ethinyl oestradiol); 5 (1-NA); 25 (2-NO); and 
2.5 (2,4-DCP). Data are reported as the average uptake of three 
independent experiments. Error bars, minimum and maximum uptake. 

In addition to bisphenol A, we evaluated the ability of P-CDP to 
remove pollutants of different size, functionality and hydrophobicity 
that span simple aromatics, pharmaceuticals and pesticides 
(Fig. 3a, Supplementary Table 1). The simple aromatics included the 
following: 2,4-dichlorophenol, an intermediate in herbicide production 
and degradation product of the antibacterial agent triclosan; 
1-naphthyl amine, an azo dye precursor and known carcinogen; 
and 2-naphthol, a model for various naphthol pollutants. We also 
evaluated the following anthropogenic contaminants: bisphenol S, 
which has replaced bisphenol A in many polycarbonates but also 
appears to be an endocrine disruptor with greater environmental 
persistence; metolachlor, one of the most common herbicides that 
is often detected in streams and groundwater; ethinyl oestradiol, 
an oestrogen mimic used in oral contraceptives that has caused 
the collapse of fish populations at concentrations as low as 5 ng l⁻¹ 
(ref. 27); and propranolol, a beta-blocker used to treat hypertension, 
which is not removed efficiently by wastewater treatment protocols 
and has been detected in effluent streams at concentrations similar to 
blood serum levels of its users. Adsorption studies of each of these 
compounds were performed similarly to those for bisphenol A (0.1 mM 
adsorbate, 1 mg ml⁻¹ adsorbent), except ethinyl oestradiol, which was 
tested at a lower concentration because of its low water solubility 
(0.04 mM adsorbate, 0.5 mg ml⁻¹ adsorbent). Each organic contaminant 
is rapidly removed by P-CDP (Fig. 3b, Extended Data Fig. 6), and 
the time-dependent adsorption curves are similar to that of bisphenol A. 
The binding constants of the tested pollutants were estimated from the 
binding efficiency at equilibrium, and all were approximately 10⁴ M⁻¹ 
(Extended Data Tables 3 and 4). The rapid uptake of these pollutants 
by P-CDP was also investigated and compared with EPI-CDP and 
high-performance NAC (Fig. 3c). P-CDP shows excellent rapid uptake of 
all pollutants, in stark contrast to the non-porous EPI-CDP, and it also 
outperforms NAC for all of the studied emerging contaminants. P-CDP 
even shows similar performance to NAC for the planar aromatic model 
compounds, which interact strongly with ACs. P-CDP’s superior performance 
for anthropogenic contaminants indicates a major advantage of 
β-CD-based adsorbents: their three-dimensional cavities are a better 
match for non-planar compounds. 

The rapid uptake of these pollutants was also investigated at environmentally 
relevant concentrations and in a mixture solution at concentrations 
between 2.5 and 100 μg l⁻¹, the range in which many polar 
organic pollutants are quantified in wastewater and drinking water 
resources. The aqueous mixture of pollutants (8 ml) was passed rapidly 
through a 0.2 μm syringe filter containing approximately 0.3 mg 
of P-CDP or NAC (Fig. 4). On average, all of these emerging contaminants 
again showed equal or greater rapid uptake by P-CDP over NAC. 
Two pollutants showed no rapid uptake by NAC at low concentrations 
whereas all eight pollutants showed at least some removal by P-CDP. 
These results demonstrate that P-CDP can at least partially remove 
polar organic pollutants at environmental concentrations rapidly and 
simultaneously when present in mixtures, suggesting that it can contribute 
to the removal of a wide range of micropollutants during water 
and wastewater treatment. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 4 July; accepted 26 October 2015. 
Published online 21 December 2015. 


1. Schwarzenbach, R. P. et al. The challenge of micropollutants in aquatic 
systems. Science 313, 1072-1077 (2006). 

2. Richardson, S. D. & Ternes, T. A. Water analysis: emerging contaminants and 
current issues. Anal. Chem. 86, 2813-2848 (2014). 

3. Murray, K. E., Thomas, S. M. & Bodour, A. A. Prioritizing research for trace 
pollutants and emerging contaminants in the freshwater environment. 
Environ. Pollut. 158, 3462-3471 (2010). 

4. McKinlay, R., Plant, J. A., Bell, J. N. B. & Voulvoulis, N. Endocrine disrupting
pesticides: implications for risk assessment. Environ. Int. 34, 168-183 (2008). 

5. Daughton, C. G. & Ternes, T. A. Pharmaceuticals and personal care products in 
the environment: agents of subtle change? Environ. Health Perspect. 107, 
907-938 (1999). 

6. Orfao, J. J. M. et al. Adsorption of a reactive dye on chemically modified
activated carbons — influence of pH. J. Colloid Interf. Sci. 296, 480-489 (2006). 

7. Putra, E. K., Pranowo, R., Sunarso, J., Indraswati, N. & Ismadji, S. Performance 
of activated carbon and bentonite for adsorption of amoxicillin from 
wastewater: mechanisms, isotherms and kinetics. Water Res. 43, 2419-2430 
(2009). 

8. Kovalova, L., Knappe, D. R. U., Lehnberg, K., Kazner, C. & Hollender, J. Removal 
of highly polar micropollutants from wastewater by powdered activated 
carbon. Environ. Sci. Pollut. Res. 20, 3607-3615 (2013). 

9. Chiang, P. C., Chang, E. E. & Wu, J. S. Comparison of chemical and thermal
regeneration of aromatic compounds on exhausted activated carbon.
Water Sci. Technol. 35, 279-285 (1997).

10. San Miguel, G., Lambert, S. D. & Graham, N. J. D. The regeneration of 
field-spent granular-activated carbons. Water Res. 35, 2740-2748 (2001). 

11. Morin-Crini, N. & Crini, G. Environmental applications of water-insoluble 
β-cyclodextrin-epichlorohydrin polymers. Prog. Polym. Sci. 38, 344-368
(2013). 

12. Lo Meo, P., Lazzara, G., Liotta, L., Riela, S. & Noto, R. Cyclodextrin-calixarene
co-polymers as a new class of nanosponges. Polym. Chem. 5, 4499-4510 
(2014). 

13. Crini, G. & Morcellet, M. Synthesis and applications of adsorbents containing 
cyclodextrins. J. Sep. Sci. 25, 789-813 (2002). 

14. Budd, P. M. et al. Polymers of intrinsic microporosity (PIMs): robust, 
solution-processable, organic nanoporous materials. Chem. Commun. 230-231 
(2004). 

15. Vandenberg, L. N., Hauser, R., Marcus, M., Olea, N. & Welshons, W. V. 
Human exposure to bisphenol A (BPA). Reprod. Toxicol. 24, 139-177 
(2007). 

16. Liang, L. et al. Occurrence of bisphenol A in surface and drinking waters and its 
physicochemical removal technologies. Front. Environ. Sci. Eng. 9, 16-38 
(2015). 

17. Kitaoka, M. & Hayashi, K. Adsorption of bisphenol A by cross-linked 
β-cyclodextrin polymer. J. Incl. Phenom. Macrocycl. Chem. 44, 429-431
(2002). 

18. Kim, Y.-H., Lee, B., Choo, K.-H. & Choi, S.-J. Selective adsorption of bisphenol A 
by organic-inorganic hybrid mesoporous silicas. Micropor. Mesopor. Mater. 
138, 184-190 (2011). 

19. Kyzas, G. Z., Lazaridis, N. K. & Bikiaris, D. N. Optimization of chitosan and 
β-cyclodextrin molecularly imprinted polymer synthesis for dye adsorption.
Carbohydr. Polym. 91, 198-208 (2013). 

20. Zhou, L.-C. et al. Highly efficient adsorption of chlorophenols onto chemically
modified chitosan. Appl. Surf. Sci. 292, 735-741 (2014). 

21. Wan Ngah, W. S., Teong, L. C. & Hanafiah, M. A. K. M. Adsorption of dyes and 
heavy metal ions by chitosan composites: a review. Carbohydr. Polym. 83, 
1446-1456 (2011). 

22. Aoki, N., Nishikawa, M. & Hattori, K. Synthesis of chitosan derivatives bearing 
cyclodextrin and adsorption of p-nonylphenol and bisphenol A. Carbohydr. 
Polym. 52, 219-223 (2003). 

23. Latch, D. E. et al. Aqueous photochemistry of triclosan: formation of 
2,4-dichlorophenol, 2,8-dichlorodibenzo-p-dioxin, and oligomerization 
products. Environ. Toxicol. Chem. 24, 517-525 (2005). 


14 JANUARY 2016 | VOL 529 | NATURE | 193 


© 2016 Macmillan Publishers Limited. All rights reserved 


24. Occupational Safety and Health Administration (OSHA) Standard, USA. Toxic 
and Hazardous Substances: 13 Carcinogens (4-Nitrobiphenyl, etc.). Standard 
number 1910.1003. http://www.osha.gov/pls/oshaweb/owadisp.show_ 
document?p_table=STANDARDS&p_id=10007 (2012). 

25. Ike, M., Chen, M. Y., Danzl, E., Sei, K. & Fujita, M. Biodegradation of a variety of 
bisphenols under aerobic and anaerobic conditions. Water Sci. Technol. 53, 
153-159 (2006). 

26. Benner, J. et al. Is biological treatment a viable alternative for micropollutant 
removal in drinking water treatment processes? Water Res. 47, 5955-5976 
(2013). 

27. Kidd, K. A. et al. Collapse of a fish population after exposure to a synthetic 
estrogen. Proc. Natl Acad. Sci. USA 104, 8897-8901 (2007). 

28. Kostich, M. S., Batt, A. L. & Lazorchak, J. M. Concentrations of prioritized 
pharmaceuticals in effluents from 50 large wastewater treatment plants in 
the US and implications for risk estimation. Environ. Pollut. 184, 354-359 
(2014). 

29. Oulton, R. L., Kohn, T. & Cwiertny, D. M. Pharmaceuticals and personal care
products in effluent matrices: a survey of transformation and removal during 
wastewater treatment and implications for wastewater management. J. Environ. 
Monit. 12, 1956-1978 (2010). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements This work was supported by the National Science 
Foundation (NSF) through the Center for Sustainable Polymers 
(CHE-1413862). This research made use of the Cornell Center for Materials 
Research User Facilities, which are supported by the NSF (DMR-1120296). We 
acknowledge I. Keresztes for help with NMR spectroscopy, and M. Matsumoto
for the design of the schematic of the polymer in Fig. 1a. 


Author Contributions A.A., B.J.S., L.X. and W.R.D. designed, synthesized,
and characterized the cyclodextrin polymers and their micropollutant uptake at 
high concentrations. Y.L. and D.E.H. designed and conducted experiments that 
quantified micropollutant uptake at low concentrations. All authors wrote the 
manuscript. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare competing financial interests: 
details are available in the online version of the paper. Readers are welcome to 
comment on the online version of the paper. Correspondence and requests 
for materials should be addressed to W.R.D. (wdichtel@cornell.edu) or D.E.H. 
(damian.helbling@cornell.edu).


METHODS 


A complete set of detailed experiments, spectral data and adsorption experiments
is available in Supplementary Information.

Reagents. β-cyclodextrin (β-CD) (>97%) and tetrafluoroterephthalonitrile (1)
(>99%) were purchased from Sigma Aldrich and used without further purification. 
Tetrahydrofuran (THF) was purified and dried in a custom-built activated alumina 
solvent purification system. Epichlorohydrin (>99%) was purchased from Sigma 
Aldrich and used as received. Aqueous solutions of pollutants were prepared using 
18 MΩ deionized H2O at neutral pH. Pollutant model compounds were obtained
from commercial sources and used as received. Norit RO 0.8 activated carbon 
(NAC) pellets were purchased from Sigma Aldrich and ground into a fine powder 
before use. Brita AC was obtained from a Brita Advanced Faucet point-of-use 
water filter and was ground into fine powder before use. Granular activated carbon 
(GAC; DARCO 12-20 mesh) was purchased from Sigma Aldrich, and ground into 
fine powder before use. 

Materials and instrumentation. Pollutant removal experiments were performed
at 25°C on a stirring hot plate with a 250 r.p.m. stirring rate. Aqueous suspensions
of adsorption experiments were filtered in syringes equipped with Whatman
0.2 μm inorganic membrane filters. Instant pollutant removal experiments were
also performed in syringes equipped with Whatman 0.2 μm inorganic membrane
filters.

Ultraviolet-visible (UV-vis) spectroscopy was performed on a Cary 5000 Varian 
UV-vis spectrometer. UV-vis spectra were recorded at RT over the range 200- 
600 nm, corrected against an appropriate background spectrum, and normalized 
to zero absorbance at 600 nm. 

Quantification of analytes from the uptake of pollutant mixtures at μg l⁻¹
concentrations was performed by mass spectrometry (HPLC-MS). The analytical method
was adopted from one previously reported for ultratrace level screening of polar
and semi-polar organic chemicals and involved high-performance liquid
chromatography (HPLC) coupled with a quadrupole-orbitrap mass spectrometer (MS)
(QExactive, ThermoFisher Scientific) and on-line solid phase extraction (EQuan
Max Plus, ThermoFisher Scientific). Samples were injected at 5 ml volumes and
were loaded onto an XBridge (Waters) C-18 Intelligent Speed (2.1 mm × 20 mm,
particle size 5 μm) trap column. Elution from the trap column and onto an XBridge
(Waters) C-18 analytical column (2.1 mm × 50 mm, particle size 3.5 μm) was
performed using a gradient pump delivering 200 μl min⁻¹ of a water and MeOH mobile
phase, each containing 0.1 vol.% formic acid. The HPLC-MS was operated with
electrospray ionization in positive and negative polarity modes. The MS acquired 
full-scan MS data within a mass-to-charge range of 100-1,000 for each sample fol- 
lowed by a data-dependent acquisition of product ion spectra (MS/MS). Analytes 
were quantified from external calibration standards based on the analyte responses 
by linear least-squares regression. Limits of quantification for each analyte were 
determined as the lowest point in the external calibration curve at which at least 
8 scans were measured across a chromatographic peak and the most intense MS/ 
MS product ion was still detected. Exact molecular masses, ionization behaviour, 
retention times, and limits of quantification used for the detection and quantifica- 
tion of each analyte are provided in Supplementary Table 2. 

Infrared spectroscopy was performed on a Thermo Nicolet iS10 with a diamond
ATR attachment. Solution-phase NMR experiments were performed on a Varian
INOVA-400 using a standard ¹H{¹³C, ¹⁵N} Z-PFG probe with a 20 Hz sample spin
rate. Solid-state NMR analyses were conducted on a Varian INOVA-400 spectrometer
using an external Kalmus ¹H linear pulse amplifier blanked using a spare line.
Samples were packed into 7 mm outside diameter silicon nitride rotors and inserted
into a Varian HX magic angle spinning (MAS) probe.

Surface area measurements were conducted on a Micromeritics ASAP 2020 
Accelerated Surface Area and Porosimetry Analyzer. Each sample (25-50 mg) was 
degassed at 90°C for 24 h and then backfilled with N2. N2 isotherms were generated
by incremental exposure to ultrahigh-purity nitrogen up to 1 atm in a liquid nitro- 
gen (77 K) bath, and surface parameters were determined using BET adsorption 
models included in the instrument software (Micromeritics ASAP 2020 V4.00). 
Synthetic procedures. Porous β-cyclodextrin polymer (P-CDP). A flame-dried
20 ml scintillation vial equipped with a magnetic stir bar was charged with β-CD
(0.200 g, 0.176 mmol), 1 (0.100 g, 0.500 mmol), and K2CO3 (0.300 g, 2.17 mmol).
The vial was flushed with N2 gas for 5 min, then dry THF (8 ml) was added and the
vial was bubbled with N2 for an additional 2-3 min. The N2 inlet was removed and the
mixture was placed on a hot stirring plate (85°C) and stirred at 500 r.p.m. for 2 d.
The orange suspension was cooled and then filtered, and the residual K2CO3 was
removed by washing the solid on the filter paper with 1 N HCl until CO2 evolution
stopped. The recovered light yellow solid was isolated and activated by soaking in
H2O (2 × 10 ml) for 15 min, THF (2 × 10 ml) for 30 min and CH2Cl2 (1 × 15 ml)
for 15 min. Finally, the solid was dried under high vacuum at 77 K in a liquid
nitrogen bath for 10 min and then at RT for 2-3 days. P-CDP (0.055 g, 20% yield)
was obtained as a pale yellow powder and subsequently characterized. ¹³C-MAS
SS-NMR (400 MHz): chemical shifts δ of 168.9 p.p.m., 157.2 p.p.m., 131.1 p.p.m.,
103.9 p.p.m., 95.2 p.p.m. and 71.8 p.p.m. relative to an external reference (the
deuterium signal of liquid CDCl3). IR (solid, attenuated total reflectance, ATR) 3,368,
2,937, 2,243, 1,684, 1,625, 1,478, 1,376, 1,304, 1,270, 1,153, 1,030 cm⁻¹. Analysis.
Calculated for (C42H79O35)1¢(CgF2N2)6.1¢(CH2Cl2)2°(H20)): CG 47.95; H, BAZ; KE 
9.97; N, 7.35. Found: C, 48.23; H, 2.99; F, 9.66; N, 7.37. 

Non-porous β-cyclodextrin polymer (NP-CDP). β-CD (2.00 g, 1.76 mmol) and 1
(2.11 g, 10.6 mmol) were mixed vigorously in an aqueous NaOH solution (6.25 N,
2.00 ml) at 85°C. The mixture solidified within 1 h, after which deionized H2O
was added and the suspension was filtered. The solid was washed by soaking in
deionized H2O (2 × 150 ml) for 15 min, THF (3 × 15 ml) for 30 min and CH2Cl2
(1 × 15 ml) for 15 min. Finally, the solid was dried under high vacuum at RT for
2 days to give NP-CDP (0.746 g, 20.1% yield) as a yellow powder. ¹³C-MAS
SS-NMR (400 MHz): δ 162.9, 143.3, 140.4, 135.1, 117.0, 99.0, 96.2, 94.1, 72.6 p.p.m.
IR (solid, ATR) 3,327, 2,938, 2,239, 1,674, 1,610, 1,463, 1,370, 1,268, 1,150, 1,100,
1,030 cm⁻¹. Analysis. Calculated for (C42H63O35)1·(C8F1.8N2)3.5·(H2O)13: C, 43.88;
H, 4.68; F, 6.25; N, 5.12. Found: C, 43.78; H, 4.51; F, 6.31; N, 5.11.

Epichlorohydrin β-cyclodextrin polymer (EPI-CDP). β-CD (3.00 g, 2.64 mmol) was
dissolved in aqueous NaOH (6.25 N, 5.00 ml) at 60°C. Epichlorohydrin (2.50 ml,
32.4 mmol) was added to this solution dropwise while stirring vigorously at 60°C.
The mixture turned into a yellow gel within 1 h, after which 10 ml of deionized
H2O was added, and the mixture was filtered on a Büchner funnel. The solid was
washed by soaking in deionized H2O (2 × 150 ml) for 15 min, THF (3 × 15 ml)
for 30 min and CH2Cl2 (1 × 15 ml) for 15 min. The solid was finally dried under
high vacuum for 2 d at RT to give EPI-CDP (3.11 g, 62% yield) as a white powder.
¹³C-MAS SS-NMR (400 MHz): δ 100.1, 72.0 p.p.m. IR (solid, ATR) 3,387, 2,923,
2,900, 1,702, 1,360, 1,030 cm⁻¹. Analysis. Calculated for
(C42H60O35)1·(C3H6O)10·(H2O)4.5: C, 48.40; H, 7.28. Found: C, 48.23; H, 7.09.
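The calculated elemental analyses can be checked from the repeat-unit formulas. The sketch below recomputes mass percentages, assuming standard atomic masses and reading the NP-CDP and EPI-CDP formulas as (C42H63O35)1·(C8F1.8N2)3.5·(H2O)13 and (C42H60O35)1·(C3H6O)10·(H2O)4.5; that reading and all helper names are assumptions for illustration, not part of the reported procedure.

```python
# Standard atomic masses in g/mol, rounded to three decimals (assumed values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "F": 18.998}


def total_atoms(units):
    """Sum atom counts over (composition, multiplier) pairs."""
    totals = {}
    for composition, multiplier in units:
        for element, count in composition.items():
            totals[element] = totals.get(element, 0.0) + count * multiplier
    return totals


def mass_percent(units):
    """Return the mass percentage of each element in the formula."""
    atoms = total_atoms(units)
    masses = {el: n * ATOMIC_MASS[el] for el, n in atoms.items()}
    total = sum(masses.values())
    return {el: 100.0 * m / total for el, m in masses.items()}


# Hypothetical encoding of the two formulas as (composition, multiplier) pairs.
NP_CDP = [
    ({"C": 42, "H": 63, "O": 35}, 1),   # substituted cyclodextrin unit
    ({"C": 8, "F": 1.8, "N": 2}, 3.5),  # terephthalonitrile linker
    ({"H": 2, "O": 1}, 13),             # water
]
EPI_CDP = [
    ({"C": 42, "H": 60, "O": 35}, 1),   # substituted cyclodextrin unit
    ({"C": 3, "H": 6, "O": 1}, 10),     # epichlorohydrin-derived linker
    ({"H": 2, "O": 1}, 4.5),            # water
]

for name, units in (("NP-CDP", NP_CDP), ("EPI-CDP", EPI_CDP)):
    percentages = mass_percent(units)
    print(name, {el: round(p, 2) for el, p in sorted(percentages.items())})
```

Under these assumptions the computed values agree with the calculated percentages quoted for both polymers to within rounding.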

FT-IR and solid-state ¹³C NMR characterization of P-CDP and NP-CDP. FT-IR
spectra of P-CDP and NP-CDP showed absorbances at 2,235 cm⁻¹, corresponding
to the nitrile stretch, as well as 1,670 cm⁻¹ and 1,463 cm⁻¹, corresponding to
C-C aromatic stretches. C-F stretches, which absorb at 1,268 cm⁻¹, are present
in the spectra of both polymers and are weaker than in the spectrum
of 1, as expected for partial F substitution. Finally, the IR spectra of P-CDP and
NP-CDP exhibited O-H stretches near 3,330 cm⁻¹, aliphatic C-H stretches around
2,930 cm⁻¹, and an intense C-O stretch at 1,030 cm⁻¹, which are spectral features
of intact β-CD (Extended Data Fig. 2). Solid-state ¹³C NMR spectra of P-CDP
and NP-CDP exhibited resonances associated with β-CD at δ = 72 and 100 p.p.m.
(Extended Data Fig. 3). Peaks at δ = 95 and 140 p.p.m. correspond to the newly
formed alkoxy groups and aromatic carbons, respectively.

Water regain analysis. P-CDP or NP-CDP (100 mg) was dispersed in deionized
H2O (10 ml) for 1 h and then filtered using 11 μm Whatman filter paper. The solids
were collected and blotted using additional Whatman filter paper, and weighed.
The water regain (expressed as weight per cent) of each polymer was determined
from the average of two measurements using the following equation:

$$\text{Water regain} = \frac{w_w - w_d}{w_d} \times 100$$

where w_w (mg) and w_d (mg) are the masses of the wet and dry polymer, respectively.
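As a minimal sketch of the water regain calculation, the snippet below applies the equation to two measurements and averages them, as the procedure describes. The function names and the example masses are hypothetical and are not data from the paper.

```python
def water_regain_percent(wet_mass_mg, dry_mass_mg):
    """Water regain in weight per cent: (w_w - w_d) / w_d * 100."""
    if dry_mass_mg <= 0:
        raise ValueError("dry mass must be positive")
    return (wet_mass_mg - dry_mass_mg) / dry_mass_mg * 100.0


def mean_water_regain(measurements):
    """Average the water regain over (wet_mass_mg, dry_mass_mg) pairs."""
    values = [water_regain_percent(wet, dry) for wet, dry in measurements]
    return sum(values) / len(values)


# Two hypothetical duplicate measurements on 100 mg of dry polymer.
example = [(265.0, 100.0), (275.0, 100.0)]
print(mean_water_regain(example))
```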
Batch adsorption kinetic studies. Adsorption kinetic studies were performed 
in 20 ml scintillation vials equipped with magnetic stir bars. All studies were 
conducted at ambient temperature on a stirring hot plate adjusted to provide a 
250 r.p.m. stirring rate.

In studies involving P-CDP and NP-CDP, the polymer (18 mg) was initially
washed with H2O for 2-3 min and then filtered on 11 μm Whatman filter paper.
The polymer was then transferred to a 20 ml scintillation vial and then a pollutant
stock solution (18 ml) was added. The mixture was immediately stirred and 2 ml
aliquots of the suspension were taken at certain intervals via syringe and filtered
immediately by a Whatman 0.2 μm inorganic membrane filter. The residual
concentration of the pollutant in each sample was determined by UV-vis spectroscopy.
In studies involving EPI-CDP, NAC, GAC and Brita AC, the adsorbent (6 mg) was
added to a 20 ml scintillation vial and then a pollutant stock solution (6 ml) was
added. The vial was stirred for a measured amount of time before the suspension
was filtered using a Whatman 0.2 μm inorganic membrane filter.

The concentrations of pollutants in stock solutions as well as in the filtrates
were characterized by UV-vis spectroscopy, based on calibration with their
measured molar extinction coefficients (ε in units of M⁻¹ cm⁻¹), which were
determined for bisphenol A (3,343 at λmax = 276 nm), bisphenol S (20,700 at
λmax = 259 nm), 2-naphthol (4,639 at λmax = 273 nm), 1-naphthyl amine (5,185
at λmax = 305 nm), 2,4-dichlorophenol (2,255 at λmax = 284 nm), and metolachlor
(15,330 at λmax = 213 nm). The ε values of ethinyl oestradiol (8,430 at
λmax = 220 nm) and propranolol hydrochloride (5,310 at λmax = 290 nm) were
reported elsewhere.
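To illustrate how an absorbance reading maps to a concentration with these coefficients, here is a small sketch based on the Beer-Lambert law, A = ε·c·l. The 1 cm path length is an assumption (the cuvette is not specified here), metolachlor is taken as ε = 15,330 at 213 nm, and the function name is hypothetical.

```python
# Molar extinction coefficients (M^-1 cm^-1) and wavelengths (nm) from the text.
EXTINCTION = {
    "bisphenol A": (3343.0, 276),
    "bisphenol S": (20700.0, 259),
    "2-naphthol": (4639.0, 273),
    "1-naphthyl amine": (5185.0, 305),
    "2,4-dichlorophenol": (2255.0, 284),
    "metolachlor": (15330.0, 213),
    "ethinyl oestradiol": (8430.0, 220),
    "propranolol hydrochloride": (5310.0, 290),
}


def concentration_mM(absorbance, pollutant, path_length_cm=1.0):
    """Convert absorbance at the pollutant's lambda-max to mmol per litre.

    Beer-Lambert law: A = epsilon * c * l, so c = A / (epsilon * l).
    The default 1 cm path length is an assumption.
    """
    epsilon, _wavelength_nm = EXTINCTION[pollutant]
    molar = absorbance / (epsilon * path_length_cm)
    return molar * 1000.0  # mol per litre -> mmol per litre


# A 0.1 mM bisphenol A stock would read A = 0.3343 in a 1 cm cell.
print(concentration_mM(0.3343, "bisphenol A"))
```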

The efficiency of pollutant removal (in %) by the sorbent was determined by
the following equation:

$$\text{Pollutant removal efficiency} = \frac{C_0 - C_t}{C_0} \times 100$$

where C_0 (mmol l⁻¹) and C_t (mmol l⁻¹) are the initial and residual concentrations
of pollutant in the stock solution and filtrate, respectively.
The amount of pollutant bound to the sorbent was determined by the following
equation:

$$q_t = \frac{(C_0 - C_t)\,M_w}{m}$$

where q_t (mg g⁻¹) is the amount of pollutant adsorbed per g of sorbent at time t (min).
C_0 (mmol l⁻¹) and C_t (mmol l⁻¹) are the initial and residual concentrations of
pollutant in the stock solution and filtrate, respectively. m (g) is the mass of sorbent
used in the study. M_w (g mol⁻¹) is the molar mass of the pollutant.
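The two equations above can be sketched in a few lines. For the units of q_t to come out in mg g⁻¹, the sketch treats the sorbent term as a loading per litre of solution (1 mg ml⁻¹ equals 1 g l⁻¹ in the kinetic studies); that interpretation, the function names and the example residual concentration are assumptions, not statements from the paper. The molar mass of bisphenol A is taken as 228.29 g mol⁻¹.

```python
def removal_efficiency_percent(c0_mM, ct_mM):
    """Pollutant removal efficiency: (C0 - Ct) / C0 * 100."""
    return (c0_mM - ct_mM) / c0_mM * 100.0


def uptake_mg_per_g(c0_mM, ct_mM, molar_mass_g_per_mol, sorbent_g_per_l):
    """Amount adsorbed per gram of sorbent, q_t, in mg per g.

    (mmol/l) * (g/mol) gives mg/l; dividing by the sorbent loading in
    g/l (an assumed reading of the mass term) gives mg per g.
    """
    return (c0_mM - ct_mM) * molar_mass_g_per_mol / sorbent_g_per_l


# Hypothetical example: 0.1 mM bisphenol A reduced to 0.005 mM
# by 1 mg/ml (1 g/l) of sorbent.
BPA_MOLAR_MASS = 228.29
print(removal_efficiency_percent(0.1, 0.005))
print(uptake_mg_per_g(0.1, 0.005, BPA_MOLAR_MASS, 1.0))
```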

The uptake rate of each adsorbent was best described by Ho and McKay's
pseudo-second-order adsorption model, shown in the following equation in a
common linearized form:

$$\frac{t}{q_t} = \frac{t}{q_e} + \frac{1}{k_{\mathrm{obs}}\, q_e^{2}}$$

where q_t and q_e are the adsorbate uptakes (mg adsorbate per g polymer) at time
t (min) and at equilibrium, respectively, and k_obs is an apparent second-order
rate constant (g mg⁻¹ min⁻¹).
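In the linearized form, a plot of t/q_t against t has slope 1/q_e and intercept 1/(k_obs·q_e²), so k_obs = slope²/intercept. The sketch below fits that line by ordinary least squares; the data are synthetic, generated from the model itself with arbitrarily chosen parameters, and the helper names are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_pseudo_second_order(times_min, uptakes_mg_per_g):
    """Return (q_e, k_obs) from the linearized pseudo-second-order model.

    t/q_t = t/q_e + 1/(k_obs * q_e**2); points with t = 0 must be excluded.
    """
    ys = [t / q for t, q in zip(times_min, uptakes_mg_per_g)]
    slope, intercept = linear_fit(times_min, ys)
    q_e = 1.0 / slope
    k_obs = slope ** 2 / intercept
    return q_e, k_obs


def model_uptake(t, q_e, k_obs):
    """Integrated pseudo-second-order uptake at time t."""
    return k_obs * q_e ** 2 * t / (1.0 + k_obs * q_e * t)


# Synthetic data from assumed parameters (not measured values).
TRUE_QE, TRUE_K = 20.0, 1.5
times = [0.5, 1.0, 2.0, 5.0, 10.0, 30.0]
uptakes = [model_uptake(t, TRUE_QE, TRUE_K) for t in times]
print(fit_pseudo_second_order(times, uptakes))
```

With real data, the correlation coefficient of this line is what the paper reports as R² for each adsorbent.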

Flow-through adsorption experiments. Individual pollutants at high (mM)
concentrations. 3.0 mg of the adsorbent was stirred in 3 ml deionized H2O for 2-3 min,
then the suspension was pushed by a syringe through a Whatman 0.2 μm inorganic
membrane filter to form a thin layer of the adsorbent on the filter membrane. 3 ml
of the pollutant stock solution was then pushed through the adsorbent over 20 s
(8-9 ml min⁻¹ flow rate). The filtrate was then measured by UV-vis spectroscopy
to determine the pollutant removal efficiency.

Mixture of pollutants at environmentally relevant (μg l⁻¹) concentrations. 15 mg of
the adsorbent (P-CDP or NAC) was added into a 20 ml vial and 5 ml nanopure
water was added to prepare a 3 g l⁻¹ stock suspension. Then 0.1 ml of the suspension
was pushed through a Whatman 0.2 μm inorganic membrane filter with a syringe
to form a thin layer of the adsorbent on the membrane. 8 ml of the diluted mixture
(100 μg l⁻¹ bisphenol A, 2.5 μg l⁻¹ BPS, 50 μg l⁻¹ ethinyl oestradiol, 100 μg l⁻¹
propranolol, 5 μg l⁻¹ metolachlor, 5 μg l⁻¹ 1-NA, 25 μg l⁻¹ 2-NO, and 2.5 μg l⁻¹
2,4-DCP) was then pushed through the adsorbent over approximately 20 s
(25 ml min⁻¹ flow rate). The experiments were conducted in duplicate/triplicate.
The filtrate was then measured by HPLC-MS (Supplementary Table 2).
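The quantities in the flow-through procedures follow from simple arithmetic, sketched below: the stock suspension concentration, the adsorbent mass deposited on the filter, and the flow rate implied by the volume and contact time. The function names are hypothetical.

```python
def suspension_concentration_g_per_l(adsorbent_mg, water_ml):
    """mg per ml is numerically equal to g per litre."""
    return adsorbent_mg / water_ml


def adsorbent_on_filter_mg(concentration_g_per_l, aliquot_ml):
    """Mass of adsorbent deposited by an aliquot (g/l equals mg/ml)."""
    return concentration_g_per_l * aliquot_ml


def flow_rate_ml_per_min(volume_ml, seconds):
    """Average flow rate for a volume pushed through in a given time."""
    return volume_ml / seconds * 60.0


stock = suspension_concentration_g_per_l(15.0, 5.0)  # 15 mg in 5 ml
print(stock)
print(adsorbent_on_filter_mg(stock, 0.1))            # 0.1 ml aliquot
print(flow_rate_ml_per_min(8.0, 20.0))               # mixture experiment
print(flow_rate_ml_per_min(3.0, 20.0))               # single-pollutant experiment
```

Pushing 8 ml through in exactly 20 s corresponds to 24 ml min⁻¹, consistent with the approximately 25 ml min⁻¹ quoted for a contact time of roughly 20 s.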


Thermodynamic studies of adsorption. 4.0 mg of sorbent was initially washed
with 3 ml of deionized H2O for 2-3 min and then filtered on a Whatman filter
paper. Then the solid was transferred to a 4 ml vial equipped with a stirring bar,
and 2 ml (for 2 mg ml⁻¹ studies) or 4 ml (for 1 mg ml⁻¹ studies) of pollutant stock
solution was added, and the suspension was stirred for 10 min to reach equilibrium.
The suspension was then filtered on a Whatman 0.2 μm inorganic membrane filter,
and the filtrate was measured by UV-vis spectroscopy.

A Langmuir adsorption isotherm was generated by plotting 1/q_e versus 1/c
in the following equation:

$$\frac{1}{q_e} = \frac{1}{q_{\max,e}} + \frac{1}{q_{\max,e}\,K\,c}$$

where q_e (mg g⁻¹) is the amount of pollutant adsorbed at equilibrium, q_max,e
(mg g⁻¹) is the maximum adsorption capacity of adsorbent at equilibrium,
c (mol l⁻¹) is the residual pollutant concentration at equilibrium, and K (M⁻¹)
is the equilibrium constant.
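In this double-reciprocal form, the intercept of 1/q_e versus 1/c is 1/q_max,e and the slope is 1/(q_max,e·K), so K = intercept/slope. The sketch below recovers both parameters by least squares from synthetic data generated with the bisphenol A values quoted in Extended Data Fig. 5 (K = 56,500 M⁻¹, q_max,e = 88 mg g⁻¹); the concentrations and helper names are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_langmuir(concentrations_M, uptakes_mg_per_g):
    """Return (q_max, K) from 1/q_e = 1/q_max + 1/(q_max * K * c)."""
    inv_c = [1.0 / c for c in concentrations_M]
    inv_q = [1.0 / q for q in uptakes_mg_per_g]
    slope, intercept = linear_fit(inv_c, inv_q)
    q_max = 1.0 / intercept
    k_eq = intercept / slope
    return q_max, k_eq


def langmuir_uptake(c, q_max, k_eq):
    """Langmuir isotherm: q_e = q_max * K * c / (1 + K * c)."""
    return q_max * k_eq * c / (1.0 + k_eq * c)


# Synthetic equilibrium points; residual concentrations are illustrative.
Q_MAX, K_EQ = 88.0, 56500.0
conc = [5e-6, 1e-5, 2e-5, 5e-5, 1e-4]
q_values = [langmuir_uptake(c, Q_MAX, K_EQ) for c in conc]
print(fit_langmuir(conc, q_values))
```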

P-CDP regeneration experiments. 10 mg of P-CDP was initially soaked in 5 ml 
deionized H2O for 5 min, and then filtered on a Whatman filter paper. The polymer
was then transferred to a 20 ml scintillation vial equipped with a magnetic stir bar, 
to which a bisphenol A stock solution (10 ml, 0.1 mM) was added. The mixture was 
stirred at RT for 10 min, and then filtered on a Whatman filter paper. The residual 
bisphenol A concentration in the filtrate was measured by UV-vis. P-CDP was 
regenerated by soaking in MeOH (10 ml) for 5 min and recovered by filtration. 
This adsorption/desorption cycle was performed five times to generate the plot 
shown in Fig. 2c. The MeOH washing filtrate from the first cycle was concentrated 
under vacuum, and the residual solid was dissolved in 10 ml deionized H2O and
measured by UV-vis to determine the amount of recovered bisphenol A that was 
adsorbed on the polymer. 
Extended Data Figure 1 | Porosity measurements of commercial ACs.
These are the materials used in Fig. 2. Shown are the N2 sorption isotherm
(77 K, left column) and the cumulative pore size distribution (right 
column) of Brita AC (a), GAC (b) and NAC (c). The cumulative pore size 
distributions of each adsorbent are similar to that of P-CDP (Fig. 1c). 
Extended Data Figure 2 | Infrared spectra of the cyclodextrin polymers
and monomers. Spectra are labelled by chemical structure or compound
name (top trace is 1, second trace down is β-CD). The FT-IR spectra
shown in this figure of P-CDP and NP-CDP reflect the incorporation of
β-CD and 1.
Extended Data Figure 3 | ¹³C CP-MAS solid-state NMR spectra of
P-CDP, NP-CDP, β-CD and 1. The spectra of P-CDP and NP-CDP exhibit
resonances associated with β-CD at δ = 72 and 100 p.p.m. (labelled a and
b, respectively). Resonances at δ = 95 and 140 p.p.m. (labelled e and c)
correspond to the newly formed alkoxy groups and aromatic carbons,
respectively. The spectrum of 1 is broadened because of ¹⁹F-¹³C coupling.
Extended Data Figure 4 | Characterization of the bisphenol A uptake
rate by each adsorbent. UV-vis spectra recorded at different contact
times (coloured traces; left column) and pseudo-second-order plots (right
column) for P-CDP (a), NP-CDP (b), EPI-CDP (c), Brita AC (d), GAC (e)
and NAC (f). t (in min) is the contact time of bisphenol A solution with the
adsorbent, and Q_t (in mg g⁻¹) is the amount of bisphenol A adsorbed per
gram of adsorbent.
Extended Data Figure 5 | Langmuir isotherm of bisphenol A adsorption
by P-CDP. The equilibrium uptake of bisphenol A, q_e (in mg g⁻¹), by
P-CDP as a function of bisphenol A residual concentration (c, in mol l⁻¹)
fits the Langmuir model, which is consistent with the formation of 1:1
inclusion complexes with an association constant (K) of 56,500 L mol⁻¹,
and an 88 mg g⁻¹ maximum equilibrium adsorption capacity (q_max,e).
Extended Data Figure 6 | Uptake of other pollutants by P-CDP. UV-vis spectra recorded as a function of contact times with P-CDP (1 mg ml⁻¹). a, BPS
(0.1 mM); b, metolachlor (0.1 mM); c, ethinyl oestradiol (0.04 mM); d, propranolol hydrochloride (0.09 mM); e, 2-NO (0.1 mM); f, 1-NA (0.1 mM); and
g, 2,4-DCP (0.1 mM).
Extended Data Table 1 | Water regain analysis of P-CDP and NP-CDP


Columns: Dry pore volume (cm³ g⁻¹) | H2O regain (wt%) | Volumetric H2O regain (cm³ g⁻¹)


P-CDP takes up more water than NP-CDP, yet the latter swells to a greater degree, presumably because 
of its decreased crosslinking density. 
Extended Data Table 2 | Rates of bisphenol A uptake by each adsorbent


Top panel columns: Sorbent | k_obs (g mg⁻¹ min⁻¹) | Correlation coefficient R² | Time to reach equilibrium (min)
Top panel rows: P-CDP, NP-CDP, EPI-CDP, NAC, GAC, Brita AC
Bottom panel columns: Sorbent | % Uptake in 10 s | % Uptake at equilibrium | % Equilibrium in 10 sᵇ | q_e (mg g⁻¹)ᶜ
Bottom panel rows: P-CDP, NP-CDP, EPI-CDP, NAC, GAC, Brita AC

Top: A comparison of the apparent second-order rate constants k_obs of the bisphenol A (initial [BPA] = 0.1 mM)
uptake by each adsorbent (1 mg ml⁻¹), correlation coefficients (R²) of the fit to the pseudo-second-order
kinetic model, and the required contact time (in minutes) to reach equilibrium. These data indicate that
P-CDP removes bisphenol A with an apparent second-order rate constant k_obs that is 15-200 times higher
than the other adsorbents. P-CDP reaches 95% of its equilibrium bisphenol A uptake within 10 s. Bottom:
Comparison of percentage bisphenol A removal efficiency by each adsorbent after 10 s and at equilibrium,
percentage equilibrium removal efficiency after 10 s, and the amount q_e, in units of milligrams per gram,
of bisphenol A adsorbed by each adsorbent at equilibrium.

ᵃ This assumes that the equilibrium value is reached within 60 min.
ᵇ Percentage of the equilibrium value that is achieved in 10 s.
ᶜ Amount of BPA adsorbed at equilibrium (mg BPA per g adsorbent).
Extended Data Table 3 | Equilibrium uptake of each pollutant by P-CDP


Columns: Pollutant | % Uptake at equilibriumᵃ | q_e (mg g⁻¹)ᵇ
Rows: metolachlor, ethinyl estradiol, propranolol

The uptake percentage and amount of bound micropollutant (q_e) were determined from the data shown
in Extended Data Fig. 6.

ᵃ Determined using the % uptake value after 10 min from the adsorption kinetic studies.
ᵇ Amount of BPA adsorbed at equilibrium (mg BPA per g adsorbent).
Extended Data Table 4 | Adsorption equilibrium constants for each
micropollutant onto P-CDP


Columns: Pollutant | K (M⁻¹)ᵃ

The adsorption equilibrium constant K for bisphenol A uptake was determined using a Langmuir
adsorption isotherm (Extended Data Fig. 5). The adsorption equilibrium constants for the other
micropollutants were estimated from the equilibrium uptake values observed in the kinetic studies
(Extended Data Fig. 6).

ᵃ Adsorption equilibrium constant.
ᵇ Calculated from Langmuir adsorption isotherm (Methods section 'Thermodynamic studies of adsorption').
ᶜ Estimated from the equilibrium uptake observed at late time points of kinetic adsorption experiments.
Iron-catalysed tritiation of pharmaceuticals


Renyuan Pony Yu¹, David Hesk², Nelo Rivera², Istvan Pelczer¹ & Paul J. Chirik¹


A thorough understanding of the pharmacokinetic and
pharmacodynamic properties of a drug in animal models is a critical
component of drug discovery and development. Such studies are
performed in vivo and in vitro at various stages of the development
process, ranging from preclinical absorption, distribution,
metabolism and excretion (ADME) studies to late-stage human
clinical trials, to elucidate a drug molecule's metabolic profile and
to assess its toxicity. Radiolabelled compounds, typically those
that contain ¹⁴C or ³H isotopes, are one of the most powerful and
widely deployed diagnostics for these studies. The introduction
of radiolabels using synthetic chemistry enables the direct tracing
of the drug molecule without substantially altering its structure
or function. The ubiquity of C-H bonds in drugs and the relative
ease and low cost associated with tritium (³H) make it an ideal
radioisotope with which to conduct ADME studies early in the drug
development process. Here we describe an iron-catalysed method
for the direct ³H labelling of pharmaceuticals by hydrogen isotope
exchange, using tritium gas as the source of the radioisotope. The
site selectivity of the iron catalyst is orthogonal to currently used
iridium catalysts and allows isotopic labelling of complementary
positions in drug molecules, providing a new diagnostic tool in drug
development.

Deuterium- and tritium-labelled compounds find widespread application
in the pharmaceutical industry because of their ability to alter
metabolism, to exploit kinetic isotope effects and, most commonly, to
perform ADME studies required for registration. The most common
methods for preparation of ³H-radiolabelled drug molecules include
reduction of a functional group to a C-³H bond using stoichiometric
reagents, or catalytic methods involving direct hydrogen isotope
exchange (HIE). With the latter approach, hydrogen atoms in the
drug molecule or an advanced intermediate undergo exchange with
tritium gas (³H2) or tritiated water (³H2O) in the presence of a
transition metal catalyst. Whereas both ³H2 and ³H2O have been applied
in drug discovery and ADME studies, ³H2 is preferred over ³H2O for
three reasons: the latter is prepared from tritium gas, is known to
undergo decomposition to its constituent elements by autoradiolysis,
and presents challenges to safe handling owing to a relatively high
radioactivity-to-volume ratio requiring commercial sources to be
diluted with natural abundance water. The higher isotopic purity and
atom efficiency associated with ³H2 enable the preparation of
radiolabelled drug compounds with higher specific activity and further
demonstrate the preference for tritium gas for HIE. Because many
known precious metal catalysts used for HIE of organic substrates
and drug molecules rely on water or organic molecules as the
source of the hydrogen isotope, the discovery of new catalyst
technology that enables HIE using ³H2 gas would be transformative for
ADME studies.

Among precious metal catalysts used for isotopic labelling of
pharmaceuticals, Crabtree's iridium catalyst, [(COD)Ir(py)PCy3]PF6
(COD = 1,5-cyclooctadiene; py = pyridine), a compound initially
developed for alkene hydrogenation, and Kerr's iridium-carbene
catalysts are the most widely used (Fig. 1). These catalysts and
other Ir(I) variants typically operate in a limited range of solvents, and
over the past two decades efforts have been focused on improving their
stability and compatibility with functional groups commonly present
in pharmaceuticals. The site selectivity of C-H exchange is
predictable and relies on directing groups to enable reactivity of the
ortho positions in aromatic and heteroaromatic rings. Depending on
the specific target and purpose of the metabolic studies, isotopologues
and isotopomers of the radiolabelled drug molecules (where the
radiolabels are introduced at different locations in the molecule and in
different amounts) may be required. Accordingly, new catalysts that
introduce tritium at sites orthogonal to those accessed by iridium and
with high degrees of incorporation and hence specific activities are
valuable in augmenting existing technologies and in providing new
diagnostics for ADME studies.

The bis(arylimidazol-2-ylidene)pyridine iron bis(dinitrogen) complex
has recently been shown by our laboratory to be an exceptionally
active base metal catalyst for the hydrogenation of unactivated,
essentially unfunctionalized alkenes. In situ monitoring of the variant
bearing saturated N-heterocyclic carbenes, (H4-iPrCNC)Fe(N₂)₂
(H4-iPrCNC = 2,6-(2,6-iPr₂-C₆H₃-4,5-H₂-imidazol-2-ylidene)₂C₅H₃N),
1, during the course of alkene hydrogenation in perdeuterated benzene
demonstrated C-H isotopic exchange between the sp² C-²H bonds
of the solvent and free H₂ gas. Quantitative ¹³C{¹H} NMR spectroscopy
(126 MHz) established formation of C₆Hₓ²H₆₋ₓ isotopologues
with diagnostic chemical shifts (Δδ = 37-65 Hz) and multiplicity
(δ(benzene-d₆) = 128.06 p.p.m., ¹J(C-²H) = 24 Hz; see Supplementary
Information for a representative spectrum). To explore the generality and
the potential for tritium exchange reactions, additional representative
substrates were evaluated with the iron catalyst.


[Figure 1 graphic. a, State-of-the-art technology for tritium labelling: [Ir] catalyst (most common: Crabtree’s catalyst), ³H₂ gas; limited solvent compatibility; requires directing groups (DG). b, Complementary technology (this work): [Fe] catalyst (1), ³H₂ gas; superior solvent compatibility; directing group not needed.]

Figure 1 | Homogeneous transition metal-catalysed hydrogen/tritium 
exchange using tritium gas. a, Ir catalysts are widely used for the tritium 
labelling of pharmaceuticals; these catalysts operate through directed 
ortho exchange in the presence of a directing group (DG); Crabtree’s 
catalyst (right) is the most popular and only operates in CH₂Cl₂. More
recently, Kerr and co-workers developed a class of Ir-carbene variants
that exhibit superior catalyst stability, efficiency and functional group
compatibility. b, Complementary technology: iron catalyst (1) as an
alternative tritiation catalyst that exhibits orthogonal C-H bond selectivity 
that is determined by steric accessibility and C-H bond acidity. Directing 
groups are not required and the catalyst operates in a range of solvents 
including THF, DMF and NMP. 


¹Department of Chemistry, Princeton University, Princeton, New Jersey 08544, USA. ²Merck Research Laboratories, Rahway, New Jersey 07065, USA.
Figure 2 | Fe-catalysed deuterium labelling of representative arenes
and heteroarenes. a, Here selectivity for isotopic exchange is governed by
steric accessibility; percentages in parentheses correspond to the extent
of deuterium incorporation at the designated positions; for 12 and 19,
3 mol% catalyst loading was employed. b, Selectivity comparison between
1 and Crabtree’s iridium catalyst. c, Inverse pressure dependence exhibited
by 1.

Catalyst evaluation studies were initially conducted with ²H₂ gas
as a more convenient and readily handled ³H₂ surrogate. The
iron-catalysed HIE of arenes and heteroarenes was initially examined (Fig. 2).
All of the catalytic reactions were conducted under standard
conditions employing 1 mol% of 1, 3.02 M substrate in THF with 4 atm of
²H₂ at 45 °C. The extent of isotopic exchange was determined after
24 h using a combination of quantitative ¹H and quantitative ¹³C{¹H}
NMR spectroscopies. The latter analytical technique has been
especially informative for substrates where multiple aromatic C-H signals
are poorly resolved in ¹H NMR spectra. Iron pre-catalyst 1 has proven
particularly effective for the ¹H/²H exchange with many important
substructures in pharmaceuticals, including substituted arenes (2-7),
pyridine derivatives (8-12) as well as other nitrogen-, oxygen- and
sulphur-containing heteroarenes (13-21). Examination of the site
selectivity of the iron-catalysed deuteration of arenes (2-7) established
two salient trends: electron-poor substrates (6, 7) undergo
C-H isotopic exchange faster than electron-rich ones (for example,
3-5), and C-H bond activation occurs at the
most sterically accessible C-H bonds. The iron-catalysed method is
also versatile within various classes of nitrogen heterocycles. With
(−)-nicotine (12), exclusive deuteration of the sterically accessible
positions of the pyridine occurs in the presence of the N-methyl pyrrolidine.
With 2,6-lutidine (10), the 4-position is the only sp²-hybridized C-H
bond to undergo isotopic exchange. Notably, significant (40%)
deuterium incorporation was observed in the sp³ benzylic positions, a
feature absent in 2-methylpyridine (8) and 2-ethylpyridine (9). In the case
of N-methyl-indole (20) and imidazo[1,2-c]pyridine (21), exchange
occurred at positions adjacent to the ring junctions.

Figure 3 | Fe-catalysed deuterium labelling of drug molecules.
a, Direct labelling of drug substrates; general catalytic reaction conditions
as follows: 10 mol% 1, 0.154 mmol substrate, 0.5 ml NMP solution, 1 atm
²H₂, 45 °C, 24 h. b, Deprotonation-protection strategy for carboxylic acids
present in 34 and 37 for compatibility with the iron catalyst. For 37,
the same deprotonation strategy was employed, but the carboxylate salt
was generated in situ and used without further isolation.

The complementary selectivity of 1 as compared to Crabtree’s
catalyst is highlighted by the deuteration of N,N-dimethylbenzamide (22).
As shown in Fig. 2b, the iridium catalyst promoted isotopic exchange
exclusively at the ortho positions as expected for an amide-directed
C-H activation. In contrast, the iron catalyst enabled deuterium
incorporation statistically at the sterically accessible meta and para
positions. Precious metal catalysts exhibiting sterically preferred
C-H selectivity are well documented, although in most cases ²H₂O
or organic solvents serve as the source of the isotopic label. An
iridium example, [(η⁵-C₅Me₅)(Me₃P)Ir(CH₃)CH₂Cl₂][BArF₄] (where
[BArF₄] = tetrakis(3,5-bis(trifluoromethyl)phenyl)borate), has been
reported whose selectivity is determined by sterically accessible C-H
bonds and employs ²H₂ or ³H₂ gas but requires a stoichiometric rather
than catalytic amount of metal complex. Because practical tritiation
of pharmaceuticals is typically conducted with a modest excess of
tritium gas (that is, at subatmospheric pressure), the deuteration of
toluene catalysed by 1 was evaluated at both 1 atm and 0.35 atm of ²H₂
gas. Higher activity was observed at the lower pressure of ²H₂ gas:
42% and 52% deuterium incorporation was observed in the para and
meta positions after 6 h. These values were reduced to 17% and 22%,
respectively, when the pressure was increased to 1 atm (Fig. 2c).

The promising functional group compatibility and the favourable
inverse pressure dependence observed with ¹H/²H exchange with 1
prompted evaluation of the deuteration and tritiation of
pharmaceuticals (Fig. 3). Although THF was used for catalyst evaluation studies
with representative arenes and heteroarenes, the limited solubility
of more functionalized pharmaceuticals in this solvent inspired use
of other media for catalytic HIE. Iron catalyst 1 has proven robust
and operated efficiently in a range of polar aprotic solvents
including dimethylacetamide (DMA), dimethylformamide (DMF) and
N-methylpyrrolidinone (NMP), the last being used for all subsequent
drug labelling studies. With 10 mol% 1, selective deuteration of
papaverine (27), an opium alkaloid antispasmodic, was achieved where
exclusive deuterium incorporation was observed at the 3-position of
the isoquinoline fragment. This site selectivity was as expected on the
basis of the small molecule studies: this C-H bond is the only sterically
accessible position for the iron catalyst. Other commercially available
pharmaceuticals, namely varenicline, loratadine, cinacalcet and
flumazenil, underwent iron-catalysed deuteration at the predicted sterically
accessible sites. With loratadine, isotopic exchange was preferred at the
3-position over the 2-position of the ring, contrary to typical C-H
functionalization preferences. With the experimental orexin receptor
antagonist, MK-6096 (30), quantitative deuterium incorporation occurred
at the 3-position of the 1,2,5-trisubstituted benzene subunit, probably
through ortho-directed C-H activation from the nitrogen atom of the
proximal pyrimidine ring. This preferred selectivity demonstrates that
in the presence of appropriate ligands, directed C-H activation can
be preferential to steric site selectivity. Pharmaceuticals such as the
CRTH2 antagonist, MK-7246 (34), and the anti-viral drug
candidate, MK-8228, proved incompatible with 1, probably as a result of
the carboxylic acid functional group. Preparation of the NMP-soluble
conjugate sodium carboxylate salt of MK-7246, 35, readily obtained by
addition of NaH to the free acid, overcame this limitation. Acidic
workup after iron-catalysed deuteration provided the isotopically
enriched product, 36, in high yield with deuterium selectively
incorporated into the C-H bonds ortho to the fluorine. As illustrated with
MK-8228 (37), a Merck developmental drug candidate for treatment
of human cytomegalovirus (HCMV) infections in transplant patients
currently undergoing Phase III clinical trials, the carboxylate salt was
generated in situ and used in the HIE reaction without isolation or
further purification.

Figure 4 | Tritium labelling of drug molecules. a, Using 1, reaction
conditions as follows: 25 mol% catalyst loading, 7 µmol substrate,
1.2 Ci ³H₂ (0.15 atm), 0.2 ml NMP, 23 °C, 16 h. b, Using Crabtree’s
catalyst, reaction conditions as follows: 25 mol% catalyst loading,
7 µmol substrate, 1.2 Ci tritium (0.15 atm), 0.5 ml CH₂Cl₂, 23 °C, 16 h.
For [³H]34a, the sodium salt conjugate base, 35, was used in place of
34 owing to incompatibility of the carboxylic acid functionality with 1
(see Supplementary Information for details). Specific activities for each
compound are given in Ci mmol⁻¹. *A comparison using another Ir-based
catalyst, [(COD)Ir(IMes)PPh₃]PF₆ (see Supplementary Information for
catalyst representation), originally developed by Kerr and co-workers,
was performed using MK-6096 as the model substrate; the Kerr catalyst
labelled the same position as Crabtree’s catalyst with a lower specific
activity of 8.8 Ci mmol⁻¹ under identical conditions.

To demonstrate the utility and to further highlight the
complementarity of the iron catalyst with existing iridium-catalysed methods, a
series of representative, commercially available pharmaceuticals was
tritiated using both 1 (Fig. 4a) and Crabtree’s iridium catalyst (Fig. 4b).
Use of ³H₂ gas was particularly informative, not only to determine site
selectivity but also to establish the relative specific activity of the final
product between the base and precious metal C-H exchange
catalysts. In all cases, tritium exchange reactions were performed using
subatmospheric pressures (~0.15 atm, approximately 12 equiv. ³H
atoms with respect to substrate) of ³H₂ gas at room temperature at
higher (25 mol%) catalyst loading, and the positions of labelling were
confirmed by ³H NMR spectroscopy. With MK-6096 ([³H]30a),
flumazenil ([³H]32a) and suvorexant ([³H]33a), tritium was detected
in trace quantities from iron-catalysed isotopic exchange in positions
previously unobserved in the deuteration experiments (Fig. 4a). These
positions are adjacent to arene substitution such as methyl groups or
ring junctions where C-H activation has a higher barrier due to steric
inaccessibility. The high sensitivity of ³H NMR spectroscopy as
compared to the corresponding ¹H and ¹³C experiments allowed detection
of the small quantities of isotopic label introduced into these positions.
For suvorexant ([³H]33a), the iron and iridium exchange methods are
complementary; the iron catalyst prefers isotopic exchange at the
relatively electron-deficient triazole, whereas iridium favours directed C-H
functionalization and tritium incorporation in the aryl ring at the site
ortho to the triazole subunit. Similar orthogonal yet complementary
site selectivity was observed with MK-7246 where, after carboxylate
deprotonation, the iron catalyst enables sterically driven tritiation of the
fluorinated arene ring while iridium promoted directed C-H exchange
exclusively in the saturated nitrogen heterocycle.

The tritiation of flumazenil highlights the beneficial reactivity
enabled by 1. Sterically accessible C-H bonds ortho to fluorine in the
arene fragment resulted in a relatively high specific activity of 16.1
Ci mmol⁻¹ ([³H]32a). Typically, specific activity values in the range
10-20 Ci mmol⁻¹ are acceptable for ADME studies. With Crabtree’s
iridium catalyst, no tritiation of flumazenil was observed under
similar conditions (Fig. 4b, [³H]32b). Although amide and ester
functionalities are present that could potentially serve as directing groups
and hence enable isotopic exchange, conformational effects probably
inhibit the accessibility of proximal C-H bonds required for
tritiation. Significantly higher specific activity was observed with the
iron-catalysed tritiation of MK-6096. This difference may be traced to the
abundance of sterically accessible C-H bonds where the iridium
catalyst only introduces the isotopic label at one site (Fig. 4b, [³H]30b).
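
To put the specific activities quoted above in perspective, the following minimal sketch estimates the theoretical maximum specific activity per tritium atom from the decay law A = λN and converts a measured value into an average number of tritium labels per molecule. The half-life of 12.32 years and all function names are assumptions introduced here for illustration; they do not come from the paper.

```python
import math

AVOGADRO = 6.02214076e23          # atoms per mole
BQ_PER_CI = 3.7e10                # becquerels per curie
TRITIUM_HALF_LIFE_YEARS = 12.32   # assumed literature value, not from the text
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def max_specific_activity_ci_per_mmol():
    """Activity (Ci) of 1 mmol of tritium atoms, from A = lambda * N."""
    decay_constant = math.log(2) / (TRITIUM_HALF_LIFE_YEARS * SECONDS_PER_YEAR)
    atoms_per_mmol = AVOGADRO / 1000.0
    return decay_constant * atoms_per_mmol / BQ_PER_CI


def average_labels_per_molecule(specific_activity_ci_per_mmol):
    """Average number of tritium atoms per molecule for a measured activity."""
    return specific_activity_ci_per_mmol / max_specific_activity_ci_per_mmol()


if __name__ == "__main__":
    print(f"max per tritium atom: {max_specific_activity_ci_per_mmol():.1f} Ci/mmol")
    for label, value in [("iron, flumazenil", 16.1), ("Kerr catalyst, MK-6096", 8.8)]:
        print(f"{label}: {average_labels_per_molecule(value):.2f} labels per molecule")
```

Under these assumptions, a specific activity of 16.1 Ci mmol⁻¹ corresponds to a little over half a tritium atom per molecule on average.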

In summary, an iron catalyst for the tritiation of pharmaceuticals 
has been discovered with selectivity for C-H bonds that is orthogonal 
and complementary to existing precious metal catalyst technology. 
This catalyst is compatible with a range of pharmaceutically relevant 
functional groups and operates efficiently in polar aprotic solvents at 
low pressures of tritium gas. The ability to introduce radiolabels into 
unique positions provides a new diagnostic for ADME studies, a critical 
component of the drug approval process. Current efforts are devoted to
elucidating the mechanism of action and to improving the stability and
handling of the iron precursor.
Critical insolation-CO₂ relation for diagnosing past
and future glacial inception


A. Ganopolski, R. Winkelmann & H. J. Schellnhuber


The past rapid growth of Northern Hemisphere continental 
ice sheets, which terminated warm and stable climate periods, 
is generally attributed to reduced summer insolation in boreal 
latitudes. Yet such summer insolation is near to its minimum at
present, and there are no signs of a new ice age. This challenges
our understanding of the mechanisms driving glacial cycles
and our ability to predict the next glacial inception. Here we
propose a critical functional relationship between boreal summer
insolation and global carbon dioxide (CO₂) concentration, which
explains the beginning of the past eight glacial cycles and might 
anticipate future periods of glacial inception. Using an ensemble 
of simulations generated by an Earth system model of intermediate 
complexity constrained by palaeoclimatic data, we suggest that 
glacial inception was narrowly missed before the beginning of the 
Industrial Revolution. The missed inception can be accounted 
for by the combined effect of relatively high late-Holocene CO₂
concentrations and the low orbital eccentricity of the Earth.
Additionally, our analysis suggests that even in the absence of
human perturbations no substantial build-up of ice sheets would
occur within the next several thousand years and that the current
interglacial would probably last for another 50,000 years. However,
moderate anthropogenic cumulative CO₂ emissions of 1,000 to 1,500
gigatonnes of carbon will postpone the next glacial inception by
at least 100,000 years. Our simulations demonstrate that under
natural conditions alone the Earth system would be expected to 
remain in the present delicately balanced interglacial climate 
state, steering clear of both large-scale glaciation of the Northern 
Hemisphere and its complete deglaciation, for an unusually 
long time. 

In accordance with classical Milankovitch theory, interglacials
(warm intervals with the lowest global ice volume) occur during
periods of high summer insolation in the boreal latitudes of the Northern
Hemisphere. In the past, a decrease in Northern Hemisphere insolation
to below its present-day level always led to the end of interglacials and
rapid growth of continental ice sheets, accompanied by a reduction
in CO₂ concentration. However, at present, although summer
insolation at 65° N is close to its minimum, there is no evidence for the
beginning of a new ice age. On the contrary, sea level, which reflects
changes in global ice volume, has remained essentially constant over the
past several millennia.

The most straightforward explanation for the lack of glacial inception
at present is that the current insolation minimum is not deep enough
because of the low orbital eccentricity of the Earth. However, glacial
inceptions have occurred in the past under similar orbital
configurations. Marine Isotope Stage (MIS) 11 (about 400,000 years before
present, 400 kyr BP) is often considered a close palaeo-analogue for the
current interglacial (the Holocene, or MIS1) owing to the similarly low
values of the eccentricity of Earth's orbit and similar CO₂ level at that
time (Fig. 1). The only difference between the insolation minimum at
about 400 kyr BP and the present one is a lower obliquity during MIS11.


With respect to the orbital parameters, MIS19 (about 800 kyr BP)
is an even closer analogue for the Holocene (see Fig. 1). Following
this analogy, it has been suggested that the current interglacial would
end naturally within the next 1,500 years if the CO₂ concentration had
stayed at a level of about 240 parts per million (p.p.m.), as was the case
at the end of MIS19 (ref. 13). However, during the late Holocene before
the beginning of the industrial era, the CO₂ concentration was about
280 p.p.m., a level that is also typical for several previous interglacials.
Therefore, MIS19, with its low CO₂ level, may not be a proper analogue
for the present interglacial either.

Figure 1 | Orbital parameters. Comparison of Earth’s orbital parameters
and CO₂ concentrations for MIS1 (green), MIS11 (blue) and MIS19
(black). The vertical dashed line corresponds to the present day for MIS1
and the minima of the precessional component of insolation for MIS11
and MIS19.

Figure 2 | Evolution of the Northern Hemisphere ice volume. Ice volume
(excluding the Greenland Ice Sheet) is given in metres of sea-level
equivalent for the Holocene and near future with a CO₂ concentration of
280 p.p.m. (a), the Holocene and near future with a CO₂ concentration
of 240 p.p.m. (b), MIS11 with a CO₂ concentration of 280 p.p.m. (c), and
MIS19 with a CO₂ concentration of 240 p.p.m. (d). Individual model
simulations from the subset of model versions compatible with the
empirical constraints are shown; the shading illustrates the entire range.
For MIS1 (the Holocene), with a CO₂ concentration of 280 p.p.m. (a),
the model realizations that are not compatible with the observational
constraints are shown as grey lines. The vertical dashed lines correspond
to the minima of the precessional component of insolation.

A key question we address here is whether subtle differences in
orbital configurations and CO₂ concentration between MIS11 and
MIS19 and the Holocene are sufficient to explain the fundamentally
different evolution of the climate system in the vicinity of its present
state. This could have important implications for the future evolution
of the Earth system.

The results presented here are based on simulations with the Earth
system model of intermediate complexity CLIMBER-2 (ref. 14),
which includes the three-dimensional thermomechanical ice sheet
model SICOPOLIS. The CLIMBER-2 model has been successfully
applied for simulating the last eight glacial cycles. It is known that
glacial inception is associated with highly nonlinear dynamics of the
climate-cryosphere system. Previous studies demonstrated high
sensitivity of the timing of the next glacial inception to modelling
parameters and prescribed CO₂ concentration. This is consistent
with the finding that glacial inception represents a bifurcation
transition between interglacial and glacial climate states and that under
the current orbital configuration and CO₂ concentration, the Earth
system is very close to this bifurcation point. To ensure that the
model correctly simulates the position of this bifurcation
transition in the phase space of external forcings, we use the
well-established fact that glacial inception occurred at the end of MIS19 and
MIS11, but not in the recent past. We quantify these constraints as
follows: for MIS19 and MIS11, only model versions that simulate a
build-up of ice sheets with a total volume of at least 5 m sea-level
equivalent are accepted. For the Holocene, on the contrary, all model
versions simulating more than 1 m sea-level equivalent of ice growth
before today are rejected. Note that these constraints are only applied
to the Northern Hemisphere and do not include the Greenland
Ice Sheet.


By perturbing one of the model parameters, which affects the surface
temperature in a uniform manner, we created an ensemble of twenty
model realizations that differ only slightly in their unperturbed
climate state (see Methods). We then applied these model realizations to
simulate MIS19, MIS11 and the Holocene forced by changing orbital
parameters and the prescribed CO₂ concentrations typical for each
interglacial (240 p.p.m. for MIS19, and 280 p.p.m. for MIS11 and the
Holocene). We selected those realizations that are compatible with the
observational constraints described above. Only four model versions
pass this test successfully.
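
The selection step described above can be sketched as a simple filter. The thresholds (at least 5 m sea-level equivalent of ice build-up for MIS19 and MIS11, no more than 1 m for the Holocene) are taken from the text; the data structure, names and example values are hypothetical and are not model output.

```python
from dataclasses import dataclass

MIN_INCEPTION_GROWTH_M = 5.0   # required ice build-up for MIS19 and MIS11 (m s.l.e.)
MAX_HOLOCENE_GROWTH_M = 1.0    # largest tolerated Holocene ice growth (m s.l.e.)


@dataclass
class Realization:
    """Simulated Northern Hemisphere ice growth (excluding Greenland), in m s.l.e."""
    name: str
    mis19_ice_m: float
    mis11_ice_m: float
    holocene_ice_m: float


def is_compatible(r: Realization) -> bool:
    """True if the realization satisfies all three palaeoclimate constraints."""
    return (
        r.mis19_ice_m >= MIN_INCEPTION_GROWTH_M
        and r.mis11_ice_m >= MIN_INCEPTION_GROWTH_M
        and r.holocene_ice_m <= MAX_HOLOCENE_GROWTH_M
    )


def select_compatible(realizations):
    return [r for r in realizations if is_compatible(r)]


# Hypothetical ensemble members, for illustration only.
EXAMPLE_ENSEMBLE = [
    Realization("too_warm", mis19_ice_m=2.0, mis11_ice_m=1.0, holocene_ice_m=0.0),
    Realization("accepted", mis19_ice_m=12.0, mis11_ice_m=8.0, holocene_ice_m=0.3),
    Realization("too_cold", mis19_ice_m=30.0, mis11_ice_m=25.0, holocene_ice_m=6.0),
]

if __name__ == "__main__":
    print([r.name for r in select_compatible(EXAMPLE_ENSEMBLE)])
```

A realization that is too warm fails the MIS19 and MIS11 constraints, and one that is too cold fails the Holocene constraint, which is why only a narrow band of the ensemble survives.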

None of these model versions simulate substantial ice sheet growth
within the next several thousands of years under the pre-industrial
CO₂ concentration of 280 p.p.m. (Fig. 2). Three of the four model
versions predict the next glacial inception about 50,000 years from now,
which is consistent with previous modelling results. Only the coldest
model version simulates a slow growth of medium-sized ice caps in the
future, mostly over the Canadian Archipelago and Scandinavia. This
slow growth is attributed to decreasing obliquity, which becomes the
dominant astronomical factor under the currently very low eccentricity.
However, even in this case, the ice volume only crosses the glaciation
threshold 20,000 years from now.

The situation is completely different for a CO₂ concentration of
240 p.p.m., which is close to that observed at the end of MIS19. In this
case all four model versions simulate rapid ice growth several
thousands of years before the present and large ice sheets exist already at
the present time (Extended Data Fig. 1). This means that the Earth
system would already be well on the way towards a new glacial state if
the pre-industrial CO₂ level had been merely 40 p.p.m. lower than it was
during the late Holocene, which is consistent with previous results.

Whether this narrow escape from glacial inception was natural
remains debatable. It has been proposed that pre-industrial
land-use at least partly contributed to the high Holocene CO₂ level, but the
magnitude of this contribution is very uncertain. This escape from
glacial inception in the recent past makes it very unlikely that
glaciation will start in the near future without a considerable drop in CO₂
concentration. That is, glaciation did not begin even under the most
favourable conditions in the past millennium for it to do so, and we
do not anticipate such conditions occurring again in the foreseeable
future. This is apparent from a stability analysis of the Earth system in
the phase space of orbital forcing and CO₂ concentration.

Figure 3 | Critical insolation-CO₂ relation. a, Best-fit logarithmic relation
(black line) between the maximum summer insolation at 65° N and the CO₂
threshold for glacial inception; grey shaded area indicates ±1 s.d. Blue dots
correspond to the coldest model version and red dots to the warmest. b, The
locations of previous glacial inceptions in insolation-CO₂ phase space relative
to the best-fit logarithmic curve from a. Glacial inception is only possible when
the point is located below the insolation-CO₂ curve. c, The timing of past and
future glacial inceptions can be explained by the CO₂ concentration and the
insolation-CO₂ relation. The thin grey line depicts the CO₂ threshold value for
glacial inception, derived as a function of the maximum summer insolation
at 65° N. The CO₂ concentration from ice core data for the past 800,000
years is shown (blue line), along with the CO₂ scenarios of 0 Gt C cumulative
anthropogenic emissions (blue line), 500 Gt C (orange line), 1,000 Gt C (red
line) and 1,500 Gt C (dark red line). Pale blue vertical bars indicate the time
periods when the reconstructed value is below the critical CO₂ concentration,
and the light blue bar shows the timing of a possible next glacial inception. The
horizontal dotted line indicates the present-day CO₂ level. The lower curve
depicts a proxy for the global ice volume (thick grey line).

Using a range of model realizations consistent with
palaeoclimatic constraints, we mapped the threshold value of CO₂
leading to glacial inception depending on the maximum summer
insolation at 65° N. The maximum summer insolation at 65° N is
the most common metric for the orbital forcing. It has been shown
that for a given CO₂ concentration, the insolation threshold for
glacial inception depends also on obliquity, but this dependence is
rather weak. For the sake of simplicity and consistency with our
previous results, we therefore use maximum summer insolation at
65° N as the proxy for orbital forcing.

Figure 3a shows that the individual points in the insolation-CO₂
space representing different combinations of orbital parameters are
clustered around the logarithmic curve. This is consistent with the
fact that radiative forcing of CO₂ is proportional to the logarithm
of CO₂ concentration and that in the CLIMBER-2 model, similar to
many other climate models, the temperature response to CO₂ and
orbital forcing is linear within the considered range of CO₂
concentrations. The critical summer insolation at 65° N can be described as
S = α ln([CO₂]/280) + β, where α = −77 W m⁻² and β = 466 W m⁻² and
[CO₂] is the concentration of CO₂ in parts per million (see Methods).
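
The critical insolation-CO₂ relation given in the text, S = α ln([CO₂]/280) + β with α = −77 W m⁻² and β = 466 W m⁻², can be evaluated directly. The sketch below is a minimal illustration using those two coefficients; the function names, the inverse form and the sample insolation value of 470 W m⁻² are assumptions introduced here, not part of the paper.

```python
import math

ALPHA = -77.0        # W m^-2, coefficient from the text
BETA = 466.0         # W m^-2, critical insolation at the 280 p.p.m. reference
CO2_REFERENCE = 280  # p.p.m., reference concentration in the relation


def critical_insolation(co2_ppm):
    """Critical maximum summer insolation at 65 N (W m^-2) for a CO2 level."""
    return ALPHA * math.log(co2_ppm / CO2_REFERENCE) + BETA


def co2_threshold(insolation_wm2):
    """Inverse relation: CO2 level (p.p.m.) below which inception is possible."""
    return CO2_REFERENCE * math.exp((insolation_wm2 - BETA) / ALPHA)


def inception_possible(insolation_wm2, co2_ppm):
    """True if CO2 lies below the threshold for the given insolation."""
    return co2_ppm < co2_threshold(insolation_wm2)


if __name__ == "__main__":
    for co2 in (240, 280):
        print(f"CO2 = {co2} p.p.m.: critical insolation = {critical_insolation(co2):.1f} W m^-2")
    # Hypothetical insolation value, chosen only to illustrate the comparison.
    print(inception_possible(470.0, 240), inception_possible(470.0, 280))
```

Because α is negative, a lower CO₂ concentration raises the critical insolation, so a given insolation minimum is more likely to fall below it.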


The validity of the critical insolation-CO₂ concentration
relationship is confirmed for all previous glacial inceptions, because each of
the past glaciations occurred when the CO₂ concentration was lower
than the threshold value (Fig. 3b, c). The conditions for glacial
inception are not met in the near future. For the orbital forcing of the late
Holocene, the CO₂ concentration remains above the threshold value
owing to the currently low eccentricity. Only about 50,000 years
from now does the threshold CO₂ value approach the pre-industrial CO₂
concentration. Therefore, even without anthropogenic perturbations,
the Holocene would be an unusually long interglacial. This also implies
that the Holocene has no proper palaeoclimate analogue within at least
the past million years.

Owing to the extremely long lifetime of anthropogenic CO₂ in the
atmosphere, past and future anthropogenic CO₂ emissions will have a
strong impact on the timing of the next glacial inception. In order to
estimate the earliest possible onset of a new glaciation, we forced the
four valid model realizations by orbital variations and different CO₂
concentration scenarios computed for the next 100,000 years with the
CLIMBER-2 model (see Methods). Initial conditions were based on
results from simulations of the last glacial cycle. Without
anthropogenic greenhouse gas emissions, the CO₂ concentration gradually
declines in the future and oscillates slightly on the orbital timescale.
Note that the long-term natural future evolution of CO₂
concentration is sensitive to model parameters and is not well constrained by
empirical data. Assuming that anthropogenic CO₂ emissions will be
reduced to zero on a centennial timescale (not considering the
possibility of negative CO₂ emissions), the long-term evolution of the CO₂
concentration depends solely on the total cumulative carbon emissions
within the next centuries.

Figure 4 | The next glacial inception. The top panel shows the temporal
evolution of the maximum summer insolation at 65° N. The middle panel
shows the simulated CO₂ concentration during the next 100,000 years
for different cumulative CO₂ emission scenarios: 0 Gt C anthropogenic
emissions (blue), 500 Gt C (orange), 1,000 Gt C (red) and 1,500 Gt C (dark
red line). The bottom panel shows simulated ice volume corresponding to
the different CO₂ emission scenarios. Individual simulations are shown
for the 1,500 Gt C scenario; for the other scenarios, the range is given as
shading.

Under three scenarios with cumulative emissions of 500 gigatonnes 
of carbon (Gt C), 1,000 Gt C and 1,500 Gt C, we simulate the ice volume 
on the Northern Hemisphere for the next 100,000 years. Even for a 
total of 500 Gt C cumulative emissions, which is only slightly above the 
present-day value, the evolution of the Northern Hemisphere ice sheets 
is affected over tens of thousands of years (Fig. 4). In the 1,000 Gt C
scenario, the probability of glacial inception during the next 100,000
years is notably reduced, and under cumulative emissions of 1,500 Gt C,
glacial inception is very unlikely within the entire 100,000 years. This
confirms our conclusions from the critical insolation threshold for
glacial inception. Because all 2013 Intergovernmental Panel on Climate
Change scenarios except Representative Concentration Pathway
2.6 (RCP2.6), which leads to the total radiative forcing of greenhouse
gases of 2.6 W m⁻² in 2100, imply that cumulative carbon emissions
will exceed 1,000 Gt in the twenty-first century, our results suggest that
anthropogenic interference will make the initiation of the next ice age 
impossible over a time period comparable to the duration of previous 
glacial cycles. 

Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
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METHODS 

Model. For this study we used the Earth system model of intermediate complexity 
CLIMBER-2 (ref. 14), which incorporates the three-dimensional thermomechanical 
ice sheet model SICOPOLIS (ref. 15). SICOPOLIS is applied only to the 
Northern Hemisphere with a spatial resolution of 1.5° × 0.75° and is fully 
interactively coupled to a low-resolution climate component. Key model parameters 
were selected to reproduce the magnitude of global ice volume variations dur- 
ing the last glacial cycle. Using the same set of model parameters, the model was 
then successfully applied to simulate the last eight glacial cycles!®. The model also 
includes a global carbon cycle component that has been used to simulate CO2 
variations during the last glacial cycle”. In the present study we use CLIMBER-2 
in both configurations: (1) with interactive carbon cycle and prescribed modern 
ice sheets and (2) with interactive ice sheets and prescribed CO2 concentration. 
Model ensemble. Although the glacial cycles simulated with the standard version 
of CLIMBER-2 are in good agreement with palaeoclimate data'® and satisfy our 
palaeoclimate constraints, it is possible that other model versions with slightly 
different model parameters are also compatible with palaeoclimate constraints. To 
assess how tightly palaeoclimate data constrain model parameters and to assess 
one of the sources of possible uncertainties in the CO2-insolation relationship 
we created an ensemble of twenty CLIMBER-2 realizations by perturbing one of 
the model parameters: namely, parameter C;, which controls the effective cloud 
height in the atmospheric model component", The advantage of this parameter is 
that it affects the equilibrium surface air temperature in a rather uniform manner, 
similar to changes in the CO2 concentration. However, a number of other model 
parameters can be used for the same purpose. By varying this parameter by ±5% 
we created a set of model realizations that differ by only ±1 °C in their global 
mean temperatures, a range that is comparable to the inter-model differences of 
CMIP-5 climate models”®. 

From the twenty-member ensemble we then selected those simulations that 
are in agreement with observational constraints provided by glacial inceptions 
at the end of MIS11 and MIS19, and the lack of glacial inception in the recent 
past. To select the model realizations compatible with the empirical constraints 
we performed a set of simulations for MIS19, MIS11 and the current interglacial 
using the entire ensemble of the CLIMBER-2 model realizations. For the current 
interglacial, the model was run for 60,000 years into the future, starting 10,000 
years before present from an equilibrium state. The MIS11 and MIS19 simulations 
begin 10,000 years before the respective insolation minima (410 kyr BP and 
790 kyr BP, respectively), which triggered glacial inceptions. In all experiments 
the CO2 concentration is fixed to the typical value for the given interglacial, that 
is, 280 p.p.m. for MIS11 and MIS1, and 240 p.p.m. for MIS19. These numbers are 
representative of the end of interglacials for MIS19, MIS11 and the late Holocene. 
Only four model realizations (including the standard one) were found to be con- 
sistent with all three palaeoclimate constraints. 

Future simulations. Simulations of climate evolution over the next 100,000 years 
were performed in two steps. First, we computed the CO2 concentration using the 
CLIMBER-2 model in the same configuration as in ref. 24. As initial conditions 
we used the state of the Earth system obtained at the end of the last glacial cycle 
simulation. (Note that the model simulates the present-day CO2 concentration as 
close to the pre-industrial 280 p.p.m. level without any anthropogenic emissions.) 
Then the model was run for 100,000 years with the fully interactive carbon cycle 
forced by anthropogenic emissions and orbital parameters. Present-day ice sheets 
were prescribed during the entire runs. In the run without anthropogenic emissions, 
CO2 gradually declines during the next 100,000 years owing to a small imbalance 
between volcanic outgassing and weathering. Assuming zero anthropogenic 
emissions beyond the next several centuries, the long-term CO2 concentration 
depends only on the total cumulative carbon emissions. Therefore we used very 
simplistic emission scenarios which correspond to cumulative carbon emissions 
of 500 Gt C, 1,000 Gt C and 1,500 Gt C. For each emission scenario we performed 
one 100,000-year-long experiment with the standard version of the model. We 
then ran each of the four valid CLIMBER-2 model versions with the interactive 
ice sheets forced by the different CO2 scenarios. 

Insolation-CO2 relationship. To find the relationship between insolation and 
CO2 concentration we performed 20 simulations with 10 different orbital 
configurations and the coldest and warmest of the four valid model versions. Note 
that the difference in global surface air temperatures between these two model 
versions does not exceed 0.5 °C under interglacial conditions. We chose orbital 
configurations that correspond to present-day conditions and eight previous 
minima of summer insolation, seven of which correspond to well defined glacial 
inceptions during the past 800 kyr. The last orbital configuration at 480 kyr BP, 
which corresponds to one of the shallowest minima in summer insolation, was 
chosen to span a broader range of summer insolation. All simulations started 
from the same pre-industrial equilibrium states both for the climate components 
and ice sheets. Initial CO2 concentration was set to 500 p.p.m. for the coldest 
three orbits and 400 p.p.m. for the other five. The orbital parameters were kept 
constant during each experiment and CO2 concentration decreased linearly with 
time at the rate of 1 p.p.m. per 1,000 climate model years (see Extended Data 
Fig. 2). To be as close to the quasi-equilibrium for the ice sheets as possible, we 
used an acceleration technique with asynchronous coupling between climate and 
ice sheets, which effectively increases the length of integration for the ice sheet by 
a factor of 10. This implies that for the ice sheet component the rate of CO2 change 
was only 1 p.p.m. per 10,000 years. Each experiment lasted 200,000 climate model 
years or 2 million ice-sheet model years. 
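
The arithmetic of the CO2 ramp and the asynchronous coupling can be checked with a short Python sketch. It is not CLIMBER-2 code: the function and constant names are hypothetical, and only the rates, the acceleration factor and the run length are taken from the text above.

```python
# Illustrative arithmetic only; names are hypothetical, not from CLIMBER-2.

CLIMATE_YEARS_PER_PPM = 1_000      # CO2 falls by 1 p.p.m. per 1,000 climate model years
ICE_SHEET_ACCELERATION = 10        # ice-sheet model years per climate model year
RUN_LENGTH_CLIMATE_YEARS = 200_000


def co2_at(climate_year, initial_ppm):
    """CO2 concentration (p.p.m.) after a linear decrease from initial_ppm."""
    return initial_ppm - climate_year / CLIMATE_YEARS_PER_PPM


def ice_sheet_years(climate_years):
    """Length of integration seen by the ice-sheet component."""
    return climate_years * ICE_SHEET_ACCELERATION


def ramp_rate_seen_by_ice_sheet():
    """CO2 rate of change in p.p.m. per ice-sheet model year."""
    return 1.0 / (CLIMATE_YEARS_PER_PPM * ICE_SHEET_ACCELERATION)


if __name__ == "__main__":
    for start in (500.0, 400.0):
        end = co2_at(RUN_LENGTH_CLIMATE_YEARS, start)
        print(f"start {start} p.p.m. -> end {end} p.p.m.")
    print("ice-sheet model years:", ice_sheet_years(RUN_LENGTH_CLIMATE_YEARS))
```

Under these numbers each experiment sweeps a 200 p.p.m. range of CO2, which the ice-sheet component experiences over 2 million model years.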

We performed several additional experiments with a slower rate of change and 
found that this rate is sufficient to detect glacial inception with an accuracy of 
about 5 p.p.m. in the CO2 space. As a criterion for glacial inception we consider the 
growth of Northern Hemisphere ice sheets (excluding Greenland) of more than 
5 m in sea-level equivalent (Extended Data Fig. 2). With the early model version 
it was shown’? that for a given CO2 concentration the onset of glacial inception 
to the first approximation depends only on the maximum summer insolation at 
65° N. This is also supported by Fig. 3a. We used least-squares methods to fit the 
data to the logarithmic curve. The choice of a logarithmic curve is natural because 
the radiative forcing of CO2 is proportional to the logarithm of CO2 concentration. 
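
A least-squares fit to a logarithmic curve can be sketched as follows. This is a minimal illustration, assuming the fitted form is critical insolation = a + b ln(CO2); the function names are hypothetical and the numbers in the demo are invented, not values from the paper.

```python
import math


def fit_log_curve(co2_ppm, critical_insolation):
    """Ordinary least squares for insolation = a + b * ln(co2).

    Returns (a, b). The functional form is an assumption made for this sketch.
    """
    x = [math.log(c) for c in co2_ppm]
    y = list(critical_insolation)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def critical_co2(insolation, a, b):
    """Invert the fitted curve: CO2 (p.p.m.) at which `insolation` is critical."""
    return math.exp((insolation - a) / b)


if __name__ == "__main__":
    # Made-up, noise-free demo data generated from arbitrary coefficients.
    demo_co2 = [200.0, 240.0, 280.0, 320.0, 400.0]
    demo_insolation = [900.0 - 80.0 * math.log(c) for c in demo_co2]
    print(fit_log_curve(demo_co2, demo_insolation))
```

Fitting in ln(CO2) keeps the regression linear, which is why a closed-form solution suffices here.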

The shaded area in Fig. 3a represents the uncertainty range of critical insolation, 
±4 W m−2 (one standard deviation). In the vicinity of pre-industrial CO2 
concentration, this uncertainty in insolation corresponds to an uncertainty in 
the CO2 threshold value of ±20 p.p.m. The absolute uncertainties are lower for 
low CO2 and higher for high CO2 concentrations. This total uncertainty originates 
from three sources: (1) uncertainties resulting from the existence of several model 
versions, all of which are consistent with our palaeoclimate constraints, (2) the fact 
that the critical insolation depends not only on total summer insolation, but, to a 
lesser degree, also on individual contributions from precession and obliquity’’, and 
(3) errors related to the methods of tracing the threshold CO2 values. Although past 
glacial inceptions provide a tight constraint on a range of model parameters (Fig. 3), 
it is necessary to note that a single-model ensemble obtained by perturbation of 
model parameters may considerably underestimate the range of uncertainties 
compared to multi-model ensembles, owing to the lack of structural uncertainties. 
More objective estimates of uncertainties in the insolation-CO2 relationship would 
therefore only be possible when a number of different Earth system models are 
available for this type of study. 

We found that our new insolation-CO2 relationship differs from that in ref. 8, 
primarily in that the new insolation-CO2 graph is concave, unlike the convex shape 
in ref. 8. This difference can be explained by the much faster rates of change of 
the forcing used in ref. 8, which did not allow bifurcation points to be detected with 
sufficient accuracy. The new methodology and greatly increased computer 
performance now allow us to determine the critical CO2 value much more accurately 
and to obtain a physically sound insolation-CO2 relationship. 

Timing of past and future glacial inceptions. Using the insolation-CO2 
relationship and orbital parameters?” we computed the value of the critical CO2 
concentration every 1,000 years (Fig. 3c and Extended Data Fig. 3). If the critical CO2 
concentration is above its actual value and the system is in an interglacial state 
then glacial inception should begin. When the system is already in a glacial state 
the critical relationship is not applicable. This is because the strong nonlinearity 
and extremely long response time of the climate-cryosphere system mean that its 
dynamics depend not only on the instantaneous forcing but also on its past 
evolution. Therefore our criterion for glacial inception can only be applied to interglacial 
states, that is, under conditions similar to the Holocene with respect to both global 
ice volume and temperatures. Some interglacials, such as the Holocene (MIS1), 
Eem (MIS5e), MIS9 and MIS11, are clearly defined in the various palaeoclimate 
records (Extended Data Fig. 3). The situation is less clear for MIS7 and MIS15, 
where several interglacials per one glacial cycle can be identified”*. For these two 
cases we show in Fig. 3 two different glacial inceptions. 
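
The inception criterion described above can be expressed as a small Python sketch. The function names and the input series are hypothetical; the logic only restates the rule in the text (critical CO2 above actual CO2, applied to interglacial states only, evaluated at fixed time steps).

```python
def inception_expected(critical_co2_ppm, actual_co2_ppm, interglacial):
    """Apply the criterion at one time step.

    Returns None when the system is not in an interglacial state, because
    the critical relationship is not applicable there.
    """
    if not interglacial:
        return None
    return critical_co2_ppm > actual_co2_ppm


def first_inception(times_kyr, critical_co2_ppm, actual_co2_ppm, interglacial):
    """Return the first time at which inception is expected, or None."""
    steps = zip(times_kyr, critical_co2_ppm, actual_co2_ppm, interglacial)
    for time, critical, actual, is_interglacial in steps:
        if inception_expected(critical, actual, is_interglacial):
            return time
    return None


if __name__ == "__main__":
    # Invented series sampled every 1,000 years (values are not from the paper).
    print(first_inception([0, 1, 2, 3],
                          [250.0, 270.0, 290.0, 300.0],
                          [280.0, 280.0, 280.0, 280.0],
                          [True, True, True, True]))
```

Returning None for glacial states keeps "not applicable" distinct from "no inception expected".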

Code availability. Code for the ice sheet model SICOPOLIS can be accessed at 
http://www.sicopolis.net. Code for the Earth system model CLIMBER-2 is 
not available owing to a lack of comprehensive technical description. 
Extended Data Figure 1 | Ice sheet at 0 kyr BP. The extent and elevation of simulated Northern Hemisphere ice sheets at the time corresponding to 
present-day (0 kyr BP, ‘0K’) insolation are shown for constant CO2 concentrations of 280 p.p.m. (a) and 240 p.p.m. (b). Experiments were performed with 
the coldest of the accepted model versions. 
Extended Data Figure 3 | Past glacial inceptions. Past glacial inceptions, detected using the critical insolation-CO2 relation, are shown in comparison 
to three different reconstructions of ice-volume variations over the past 800 kyr (refs 25, 29 and 30) and results of model simulations. 
Plant functional traits have globally consistent 
Phenotypic traits and their associated trade-offs have been shown 
to have globally consistent effects on individual plant physiological 
functions (refs 1–3), but how these effects scale up to influence competition, 
a key driver of community assembly in terrestrial vegetation, has 
remained unclear *. Here we use growth data from more than 
3 million trees in over 140,000 plots across the world to show how 
three key functional traits—wood density, specific leaf area and 
maximum height—consistently influence competitive interactions. 
Fast maximum growth of a species was correlated negatively with 
its wood density in all biomes, and positively with its specific leaf 
area in most biomes. Low wood density was also correlated with 
a low ability to tolerate competition and a low competitive effect 
on neighbours, while high specific leaf area was correlated with 
a low competitive effect. Thus, traits generate trade-offs between 
performance with competition versus performance without 
competition, a fundamental ingredient in the classical hypothesis 
that the coexistence of plant species is enabled via differentiation 
in their successional strategies’. Competition within species was 
stronger than between species, but an increase in trait dissimilarity 
between species had little influence in weakening competition. No 
benefit of dissimilarity was detected for specific leaf area or wood 
density, and only a weak benefit for maximum height. Our trait- 
based approach to modelling competition makes generalization 
possible across the forest ecosystems of the world and their highly 
diverse species composition. 

Phenotypic traits are considered fundamental drivers of community 
assembly and thus species diversity>®. The effects of traits on individual 
plant physiologies and functions are increasingly understood, and have 
been shown to be underpinned by well-known and globally consistent 
trade-offs (refs 1–3). For instance, traits such as wood density and specific leaf 
area capture trade-offs between the construction cost and longevity or 
strength of wood and leaf tissues”. By contrast, we still have a limited 
understanding of how such trait-based trade-offs translate into compet- 
itive interactions between species, particularly for long-lived organisms 
such as trees. Competition is a key filter through which ecological and 
evolutionary success is determined’. A long-standing hypothesis is that 
the intensity of competition decreases as two species diverge in trait 
values’ (trait dissimilarity). The few studies*!° that have explored links 
between traits and competition have shown that linkages were more 
complex than this, as particular trait values may also confer competitive 
advantage independently from trait dissimilarity?'*. This distinction 
is fundamental for species coexistence and the local mixture of traits. 
If neighbourhood competition is driven mainly by trait dissimilarity, 
this will favour a wide spread of trait values at a local scale. By contrast, 
if neighbourhood interactions are mainly driven by the competitive 
advantage associated with particular trait values, those trait values 
should be strongly selected at the local scale, with coexistence oper- 
ating at larger spatial or temporal scales®!. Empirical investigations 
have been limited so far to a few particular locations, restricting our 
ability to find general mechanisms that link traits and competition in 
the main vegetation types of the world. 

Here we quantify the links between traits and competition, 
measured as the influence of neighbouring trees on growth of a 
focal tree. Our framework is novel in two important ways: first, 
competition is analysed at an unprecedented scale covering all the 
major forest biomes on Earth (Fig. 1a and Extended Data Fig. 1), 
and second, the influence of traits on competition is partitioned 
among four fundamental mechanisms (Fig. 1b, c) as follows. A 
competitive advantage for trees with some trait values compared 
to others can arise by: (1) permitting faster maximum growth in 
the absence of competition’®; (2) exerting a stronger competi- 
tive effect!®!”, meaning that competitor species possessing those 
Figure 1 | Assessing competitive interactions at global scale. 

a, Precipitation-temperature space occupied by each data set (LPP, large 
permanent plots data; NFI, national forest inventories data). For data with 
multiple plots, the range of climatic conditions is represented by an ellipse 
covering 98% of the plots. Biomes are: 1, tundra; 2, taiga; 3, Mediterranean; 
4, temperate forest; 5, temperate rainforest; 6, desert; 7, tropical seasonal 
forest; and 8, tropical rainforest (see Methods for details). b, Sampled 
patches vary in the abundance of competitors from species c around 
individuals of focal species f. c, We modelled how trait values of the focal 
tree ($t_f$), and the abundance (measured as the sum of their basal areas) and 
trait values of competitor species ($t_c$) influenced basal area growth of the 
focal tree. Species maximum growth (red) was influenced by the trait of the 
focal tree ($m_0 + m_1 t_f$, with $m_0$ the maximum growth independent of the trait). 
Reduction in growth per unit basal area of competitors ($-\alpha_{f,c}$, black) was 
modelled as the sum of growth reduction independent of the trait (blue) 
by conspecific ($\alpha_{0,\mathrm{intra}}$) and heterospecific ($\alpha_{0,\mathrm{inter}}$) competitors, the effect 
of competitor traits ($t_c$) on their competitive effect ($\alpha_e$), the effect of the 
focal tree's traits ($t_f$) on its tolerance of competition ($\alpha_t$), and the effect of 
trait dissimilarity between the focal tree and its competitors ($|t_c - t_f|$) on 
competition ($\alpha_d$). The parameters $m_0$, $m_1$, $\alpha_{0,\mathrm{intra}}$, $\alpha_{0,\mathrm{inter}}$, $\alpha_t$, $\alpha_e$ and $\alpha_d$ are 
fitted from data using a maximum likelihood method. 


traits suppress more strongly the growth of their neighbours; or 
(3) permitting a better tolerance of competition (described as a compet- 
itive ‘response’ in ref. 16), meaning that the growth of species possess- 
ing those traits is less affected by competition from neighbours. Finally, 
(4) competition can promote trait diversification, if increasing trait dis- 
similarity between species reduces interspecific competition compared 
to intraspecific competition’. Here we show how these four mecha- 
nisms are connected to three key traits that describe plant strategies 
worldwide’. These traits are wood density (an indicator of a trade- 
off in stems between growth and strength), specific leaf area (SLA; 
an indicator of a trade-off in leaves between cheap construction cost 
and leaf longevity), and maximum height (an indicator of a trade-off 
between sustained access to light and early reproduction). We analyse 
the basal area growth (annual increase in the cross-section area of a 
tree trunk at 1.3 m height) of more than 3 million trees from over 
2,500 species, across all major forested biomes of the Earth (Fig. 1). 
Species mean trait values were extracted from local data bases and 
the global TRY data base'® (see Methods). We analysed how the basal 
area growth of each individual tree was reduced by the abundance of 
competitors in its local neighbourhood’? (measured as the sum of 
basal areas of competitors in m² ha⁻¹), accounting for traits of both 
the focal tree and its competitors. This analysis allowed effect sizes to 
be estimated for each of the four mechanisms outlined earlier (Fig. 1c). 

Across all biomes, the strongest driver of individual growth was the 
total abundance of neighbours, irrespective of their traits (parameters 
$\alpha_{0,\mathrm{intra}}$ and $\alpha_{0,\mathrm{inter}}$ in Fig. 2). Values were strongly positive, indicating that 
neighbours had competitive rather than facilitative effects. The main 
effects of traits were that some trait values led to a competitive advan- 
tage compared to others through two main mechanisms. First, traits 
of the focal species influenced its maximum growth, that is, growth 
in the absence of competition (parameter $m_1$ in Fig. 2 and Extended 
Data Table 1). The fastest growing species had low wood density and 
high SLA values, although the confidence interval on the trait effect 
intercepted zero in two out of five biomes for SLA (Fig. 2). This is in 
agreement with previous studies'>”° of adult trees reporting a strong 
link between maximum growth and wood density but a weaker link 
for SLA. Second, some trait values were associated with species having 
stronger competitive effects, or better tolerance of competition (Fig. 2 
and Extended Data Table 1). High wood density was correlated with 
better tolerance of competition from neighbours and with a stronger 
competitive effect on neighbours, whereas low SLA was correlated only 
with a stronger competitive effect. This agrees with studies reporting 
that high wood density species are more shade-tolerant'® and have 
deeper and wider crowns”!’, hence potentially higher light intercep- 
tion (further detail in Supplementary Discussion). The shorter leaf 
lifespan associated with high SLA results in lower leaf mass fraction”. 
The low competitive effect associated with high SLA species could thus 
result from a lower light interception, but few data are available on this 
link’. Maximum height was weakly negatively correlated with toler- 
ance to competition in three out of five biomes, supporting the idea 
that sub-canopy trees are more shade-tolerant”'. We found, however, 
no correlation between maximum height and competitive effect. The 
current height of an individual does have an influence on light intercep- 
tion, a key process in competition’*. But maximum height of a species 
reflects its long-term strategy, and would possibly have stronger effects 
on long-term population level competition outcomes than it did on 
short-term basal area growth”. 

After separating trait-independent differences between intraspecific 
versus interspecific competition, trait dissimilarity had little effect on 
competition between species (Fig. 2). Only dissimilarity in maximum 
height between focal and neighbouring species led to a weak, but con- 
sistent, decrease in competitive suppression of tree growth (Fig. 2). 
Mechanisms explaining this effect are poorly understood, but could 
possibly result from complementary crown architectures**”®. The aver- 
age differences in strength of interspecific versus intraspecific com- 
petition between two species—a key indicator of processes that could 
stabilize coexistence— were thus only weakly related to trait dissimi- 
larity (Extended Data Fig. 2). Trait dissimilarity effects are widely con- 
sidered to be a key mechanism by which traits affect competition’, but 
our analysis shows at global scale that trait dissimilarity effects are weak 
or absent. It remains unclear why the trait-independent competitive 
effects are higher within species than between species. Higher loads of 
shared specialized pathogens”’ could plausibly contribute. Other traits 
or combinations of traits (see ref. 12) may show stronger trait dissimi- 
larity effects, but we currently lack the trait data to capture such effects. 

Analyses allowing for different effects among biomes did not show 
any particular biome behaving consistently differently from the others 
(Fig. 2). This lack of context dependence in trait effects may seem sur- 
prising, but reinforces the idea that competition for light is important in 
most forests, and this may explain why we find consistency across such 
diverse forest types (see Supplementary Discussion for further details). 

Our global study supports the hypothesis that trait values favour- 
ing high tolerance of competition or high competitive effects also 
render species slow-growing in the absence of competition across 
all forested biomes (Fig. 3). This trait-based trade-off is a key ingre- 
dient in the classical model of successional coexistence in forests, 
in which fast-growing species are more abundant in early successional 
associated with a slow potential growth rate but a high tolerance to 
competition and a strong competitive effect (Fig. 3). A similar pattern 
was present, although less clear, for SLA. High SLA was correlated with 
a low competitive effect but fast maximum growth (confidence inter- 
vals not spanning zero in three biomes, Figs 2 and 3, see Extended 
Data Fig. 3 for maximum height). Given that the long-term outcomes 
of competition at the population level may be more influenced by 
tolerance of competition than by the competitive effect!®, SLA might 
be less influential in succession. 

Coordination between trait values conferring a strong competitive 
effect and trait values conferring a high tolerance of competition has 
been widely expected®"®, but rarely documented'®*?*. Only wood density 
showed such coordination, as it was correlated with both competitive 
effect and tolerance of competition in the same direction (Fig. 2). 

The globally consistent links that we report here between traits and 
competition have considerable promise for predicting species inter- 
actions governing forest communities across different forest biomes 
and continents of the globe. Our analysis demonstrates that trait 
dissimilarity is not the major determinant of local-scale competitive 
effects on tree growth, at least for these three traits. By contrast, the 
trait-based trade-off in performance with competition versus without 
competition, reported here, could promote the coexistence of species 
with diverse traits, provided disturbances create a mosaic of succes- 
sional stages. A challenge for the future is to move beyond growth to 
analyse all key demographic rates and life-history stages, and analyse 
how traits influence competitive outcomes and stable coexistence at 
the population level. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Model and analysis. To examine the link between competition and traits, we used 
a neighbourhood modelling framework*” to model the growth of a focal tree of 
species (f) as a product of its maximum growth (determined by its traits and size) 
together with reductions due to competition from individuals growing in the local 
neighbourhood (see definition below). Specifically, we assumed a relationship of 
the form 


$$G_{i,f,p,s,t} = G_{\max f,p,s}\, D_{i,f,p,s,t}^{\gamma_f} \exp\left(\sum_{c=1}^{N_i} -\alpha_{f,c} B_{i,c,p,s}\right) \qquad (1)$$


in which $G_{i,f,p,s,t}$ and $D_{i,f,p,s,t}$ are the annual basal area growth (G) and diameter (D) 
at breast height of individual (i), from species (f), plot or quadrat (p) (see below), 
data set (s) and census (t). $G_{\max f,p,s}$ is the maximum basal area growth for species 
(f) on plot or quadrat (p), in data set (s); that is, in absence of competition. $\gamma_f$ 
determines the rate at which growth changes with size for species (f), modelled 
with a normally distributed random effect of species $\varepsilon_{\gamma,f}$ (as $\gamma_f = \gamma_0 + \varepsilon_{\gamma,f}$, in which 
$\varepsilon_{\gamma,f} \sim N(0, \sigma_\gamma)$, a normal distribution of mean 0 and standard deviation $\sigma_\gamma$). $\alpha_{f,c}$ is 
the per unit basal area effect of individuals from species (c) on growth of an 
individual in species (f). $B_{i,c,p,s} = 0.25\pi \sum_{j \neq i} w_j D_{j,c,p,s,t}^2$ is the sum of basal area of all 
individual competitor trees (j) of the species (c) within the local neighbourhood 
of the tree (i), in plot (p), data set (s), and census (t), where $w_j$ is a constant based 
on neighbourhood size for tree (j), depending on the data set (see below). Note 
that $B_{i,c,p,s}$ includes all trees of species (c) in the local neighbourhood except the tree 
(i), and $N_i$ is the number of competitor species in the local neighbourhood of focal 
tree (i). Values of $\alpha_{f,c} > 0$ indicate competition, whereas $\alpha_{f,c} < 0$ indicates 
facilitation. 
A log-transformation of equation (1) leads to a linearized model of the form 


$$\log(G_{i,f,p,s,t}) = \log(G_{\max f,p,s}) + \gamma_f \log(D_{i,f,p,s,t}) + \sum_{c=1}^{N_i} -\alpha_{f,c} B_{i,c,p,s} \qquad (2)$$
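
The relationship between equations (1) and (2) can be illustrated with a minimal Python sketch. It is not the authors' code: the function and argument names are hypothetical, and the per-species competition coefficients and basal areas are passed as plain lists.

```python
import math


def basal_area_growth(g_max, diameter, gamma, alphas, basal_areas):
    """Equation (1): maximum growth scaled by size and reduced by competition.

    alphas[c] is the per unit basal area effect of competitor species c and
    basal_areas[c] is the summed basal area of that species in the neighbourhood.
    """
    competition = sum(a * b for a, b in zip(alphas, basal_areas))
    return g_max * diameter ** gamma * math.exp(-competition)


def log_basal_area_growth(g_max, diameter, gamma, alphas, basal_areas):
    """Equation (2): the log-transformed, linearized form of equation (1)."""
    competition = sum(a * b for a, b in zip(alphas, basal_areas))
    return math.log(g_max) + gamma * math.log(diameter) - competition


if __name__ == "__main__":
    # Invented inputs, chosen only to show that both forms agree.
    args = (2.0, 30.0, 0.5, [0.01, 0.02], [10.0, 5.0])
    print(math.log(basal_area_growth(*args)), log_basal_area_growth(*args))
```

With positive coefficients the exponential term is below one, so neighbours reduce growth; negative coefficients would correspond to facilitation.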


To include the effects of traits on the parameters of the growth model, we build 
on previous studies that explored the role of traits for tree performances and tree 
competition®!!. We modelled the effect of traits, one trait at a time. The effect of 
a focal species’ trait value ($t_f$) on its maximum growth was included as: 


$$\log(G_{\max f,p,s}) = m_0 + m_1 t_f + m_2\,\mathrm{MAT} + m_3\,\mathrm{MAP} + \varepsilon_{G_{\max},f} + \varepsilon_{G_{\max},p} + \varepsilon_{G_{\max},s} \qquad (3)$$


Here, $m_0$ is the average maximum growth, $m_1$ gives the effect of the focal species 
trait, $m_2$ and $m_3$ the effects of mean annual temperature (MAT) and sum of annual 
precipitation (MAP), respectively, and $\varepsilon_{G_{\max},f}$, $\varepsilon_{G_{\max},p}$ and $\varepsilon_{G_{\max},s}$ are normally 
distributed random effects for species (f), plot or quadrat (p) (see below), 
and data set (s) (in which $\varepsilon_{G_{\max},f} \sim N(0, \sigma_{G_{\max},f})$, $\varepsilon_{G_{\max},p} \sim N(0, \sigma_{G_{\max},p})$ and 
$\varepsilon_{G_{\max},s} \sim N(0, \sigma_{G_{\max},s})$). 

As shown in Fig. 1, the competitive parameter ($\alpha_{f,c}$) was modelled using an equation 
of the form: 

$$\alpha_{f,c} = \alpha_{0,\mathrm{intra},f}\, C + \alpha_{0,\mathrm{inter},f}\,(1 - C) - \alpha_t t_f + \alpha_e t_c + \alpha_d |t_c - t_f| \qquad (4)$$

in which $\alpha_{0,\mathrm{intra},f}$ and $\alpha_{0,\mathrm{inter},f}$ are (respectively) intraspecific and average interspecific 
trait-independent competition for the focal species (f), each modelled with a 
normally distributed random effect of species (f) and a normally distributed random 
effect of data set (s) (such that $\alpha_{0,\mathrm{intra},f} = \alpha_{0,\mathrm{intra}} + \varepsilon_{\alpha_0,\mathrm{intra},f} + \varepsilon_{\alpha_0,\mathrm{intra},s}$, in which 
$\varepsilon_{\alpha_0,\mathrm{intra},f} \sim N(0, \sigma_{\alpha_0,\mathrm{intra},f})$ and $\varepsilon_{\alpha_0,\mathrm{intra},s} \sim N(0, \sigma_{\alpha_0,\mathrm{intra},s})$, and replacing intra by inter 
gives the expressions for $\alpha_{0,\mathrm{inter},f}$). C is a binary variable taking the value one for 
f = c (conspecific) and zero for f ≠ c (heterospecific). $\alpha_t$ is the tolerance of 
competition by the focal species, that is, change in competition tolerance due to traits ($t_f$) 
of the focal tree, with a normally distributed random effect of data set (s) included 
($\varepsilon_{\alpha_t,s} \sim N(0, \sigma_{\alpha_t})$). $\alpha_e$ is the competitive effect, that is, change in competition effect 
due to traits ($t_c$) of the competitor tree, with a normally distributed random effect 
of data set (s) included ($\varepsilon_{\alpha_e,s} \sim N(0, \sigma_{\alpha_e})$), and $\alpha_d$ is the effect of trait dissimilarity, 
that is, change in competition due to absolute distance between traits $|t_c - t_f|$, with 
a normally distributed random effect of data set (s) included ($\varepsilon_{\alpha_d,s} \sim N(0, \sigma_{\alpha_d})$). 
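
The fixed-effect part of equation (4) can be written as a short Python sketch. The function and argument names are hypothetical, random effects are left out, and the parameter values in the demo are invented for illustration.

```python
def competition_coefficient(t_f, t_c, conspecific,
                            alpha0_intra, alpha0_inter,
                            alpha_t, alpha_e, alpha_d):
    """Fixed-effect part of equation (4) for one focal/competitor species pair.

    t_f and t_c are the trait values of the focal and competitor species;
    conspecific is True when both belong to the same species (C = 1).
    """
    c = 1 if conspecific else 0
    return (alpha0_intra * c
            + alpha0_inter * (1 - c)
            - alpha_t * t_f                # tolerance of competition (focal trait)
            + alpha_e * t_c                # competitive effect (competitor trait)
            + alpha_d * abs(t_c - t_f))    # trait dissimilarity


if __name__ == "__main__":
    # Invented parameter values, not estimates from the paper.
    params = dict(alpha0_intra=0.5, alpha0_inter=0.25,
                  alpha_t=0.125, alpha_e=0.25, alpha_d=0.5)
    print(competition_coefficient(1.0, 3.0, False, **params))
    print(competition_coefficient(1.0, 1.0, True, **params))
```

For a conspecific pair the dissimilarity term vanishes, so any remaining difference from the heterospecific case comes from the two trait-independent intercepts and the trait terms.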

Our decomposition of the competition parameter ($\alpha$) into trait-based processes 
builds on previous studies. In one of the first studies, Uriarte et al. (ref. 8) modelled $\alpha$ 
as $\alpha = \alpha_0 + \alpha_d |t_f - t_c|$. Then, Kunstler et al. used two different models: 
$\alpha = \alpha_0 + \alpha_d |t_f - t_c|$ or $\alpha = \alpha_0 + \alpha_h (t_f - t_c)$. Finally, Lasky et al. (ref. 11) developed a single model 
including multiple processes as $\alpha = \alpha_0 + \alpha_t t_f + \alpha_h (t_f - t_c) + \alpha_d |t_f - t_c|$. Here we 
extended the approach of this most recent study (ref. 11) by splitting $\alpha_h (t_f - t_c)$ into 
$\alpha_t t_f + \alpha_e t_c$ (which is equivalent to the hierarchical distance if $\alpha_t = -\alpha_e$) and 
including two $\alpha_0$, one for intraspecific and one for interspecific competition. 

To simplify the estimation, equation (4) was combined with the basal area 
of each competing species to relate the parameters directly to the community 
weighted means of the different trait variables as: 


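$$\sum_{c=1}^{N_i} \alpha_{f,c} B_{i,c,p,s} = \alpha_{0,\mathrm{intra},f} B_{i,f} + \alpha_{0,\mathrm{inter},f} B_{i,\mathrm{het}} - \alpha_t t_f B_{i,\mathrm{tot}} + \alpha_e B_{i,t_c} + \alpha_d B_{i,|t_c - t_f|} \qquad (5)$$

where $B_{i,\mathrm{het}} = \sum_{c \neq f} B_{i,c}$ is the sum of basal area of heterospecific competitors 
(het), $B_{i,\mathrm{tot}} = B_{i,f} + B_{i,\mathrm{het}}$ is the sum of basal area of all competitors, 
$B_{i,t_c} = \sum_{c=1}^{N_i} t_c \times B_{i,c}$ and $B_{i,|t_c - t_f|} = \sum_{c=1}^{N_i} |t_c - t_f| \times B_{i,c}$. $N_i$ is the number of 
species in the local neighbourhood of the tree (i) (note that the indices p and s for plot 
and data set are not shown here for sake of simplicity). 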
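
The aggregation in equation (5) can be checked numerically with a minimal Python sketch. The data layout (a dictionary mapping species name to trait value and basal area) and all names are assumptions made for illustration; the sketch compares the aggregated form with the species-by-species sum built from equation (4).

```python
def aggregated_competition(focal_species, t_f, neighbours,
                           alpha0_intra, alpha0_inter,
                           alpha_t, alpha_e, alpha_d):
    """Right-hand side of equation (5).

    neighbours maps species name -> (trait value t_c, basal area B_ic).
    """
    b_f = sum(b for sp, (_, b) in neighbours.items() if sp == focal_species)
    b_het = sum(b for sp, (_, b) in neighbours.items() if sp != focal_species)
    b_tot = b_f + b_het
    b_tc = sum(t_c * b for t_c, b in neighbours.values())
    b_diff = sum(abs(t_c - t_f) * b for t_c, b in neighbours.values())
    return (alpha0_intra * b_f + alpha0_inter * b_het
            - alpha_t * t_f * b_tot + alpha_e * b_tc + alpha_d * b_diff)


def summed_competition(focal_species, t_f, neighbours,
                       alpha0_intra, alpha0_inter,
                       alpha_t, alpha_e, alpha_d):
    """Left-hand side of equation (5): sum over species of alpha_fc * B_ic."""
    total = 0.0
    for species, (t_c, b) in neighbours.items():
        alpha0 = alpha0_intra if species == focal_species else alpha0_inter
        alpha_fc = (alpha0 - alpha_t * t_f + alpha_e * t_c
                    + alpha_d * abs(t_c - t_f))
        total += alpha_fc * b
    return total


if __name__ == "__main__":
    # Invented neighbourhood: focal species "A" with one heterospecific neighbour.
    nb = {"A": (1.0, 2.0), "B": (3.0, 4.0)}
    p = dict(alpha0_intra=0.5, alpha0_inter=0.25,
             alpha_t=0.125, alpha_e=0.25, alpha_d=0.5)
    print(aggregated_competition("A", 1.0, nb, **p),
          summed_competition("A", 1.0, nb, **p))
```

Working with the aggregated sums means each tree contributes a handful of covariates rather than one term per competitor species, which is what makes the linear mixed model tractable.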

Estimating separate $\alpha_0$ for intra- and interspecific competition allowed us to 
account for trait-independent differences in interactions with conspecifics and 
heterospecifics. We also explored a simpler version of the model where 
trait-independent competitive effects were pooled (that is, there was a single value for 
$\alpha_0$), as previous studies have generally not made this distinction, using the 
following equation: 

$$\alpha_{f,c} = \alpha_{0,f} - \alpha_t t_f + \alpha_e t_c + \alpha_d |t_c - t_f| \qquad (6)$$

In this alternative model, any differences between intra- and interspecific 
competition do enter into trait dissimilarity effects, with a trait dissimilarity of zero 
attached to them. This may lead to an overestimation of the trait dissimilarity effect. 
Results for this model are presented in Extended Data Fig. 4. 

Equations (2)-(4) were then fitted to empirical estimates of growth based on the
change in diameter between census t and t + 1 (respectively at years $y_t$ and $y_{t+1}$),
given by

$$G_{i,f,p,s,t} = 0.25\pi \left(D_{i,f,p,s,t+1}^2 - D_{i,f,p,s,t}^2\right) / \left(y_{t+1} - y_t\right) \qquad (7)$$
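
Equation (7) is the annualised change in stem cross-sectional area between two censuses. A minimal sketch, assuming diameters are given in centimetres (so that growth is in cm² per year) and census times in years; the function and argument names are hypothetical.

```python
import math

def basal_area_growth(d_t, d_t1, year_t, year_t1):
    """Annual basal area growth between two censuses (equation (7)).

    d_t, d_t1: diameter at the first and second census (same unit).
    year_t, year_t1: census years; the second must be later than the first.
    """
    if year_t1 <= year_t:
        raise ValueError("second census must be later than the first")
    return 0.25 * math.pi * (d_t1 ** 2 - d_t ** 2) / (year_t1 - year_t)
```

For example, a stem growing from 20 cm to 22 cm over five years gains 0.25π(484 − 400)/5 = 4.2π cm² per year.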


To estimate standardised coefficients (one type of standardised effect size)*),
response and explanatory variables were standardized (divided by their standard
deviations) before analysis. Trait and diameter were also centred to facilitate
convergence. The models were fitted using the lmer routine in the lme4 package*
in the R statistical environment**. We fitted two versions of each model. In the
first version parameters $m_0$, $m_1$, $\alpha_0$, $\alpha_t$, $\alpha_e$, $\alpha_d$ were estimated as constant across
all biomes. In the second version, we allowed different fixed estimates of these 
parameters for each biome. This enabled us to explore variation among biomes. 
Because some biomes had few observations, we merged those with biomes with 
similar climates. Tundra was merged with taiga, tropical rainforest and tropical 
seasonal forest were merged into tropical forest, and deserts were not included in 
this final analysis as too few plots were available. To evaluate whether our results 
were robust to the random effect structure we also explored a model with a random 
effect attached to parameters both for the data set and for a local ecoregion using 
the Köppen-Geiger ecoregion** (see Supplementary Results).
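
The preprocessing described above (scaling by the standard deviation, optional centring, and merging of sparsely sampled biomes) can be sketched as follows. This is an illustration only: the use of the sample standard deviation, the function names and the biome label strings are assumptions, and the original analysis was carried out in R.

```python
import statistics

def standardise(values, centre=False):
    """Divide values by their standard deviation, optionally centring first.

    Uses the sample standard deviation (an assumption; the text does not
    say which estimator was used).
    """
    sd = statistics.stdev(values)
    mean = statistics.fmean(values) if centre else 0.0
    return [(v - mean) / sd for v in values]

# Biome merging as described in the text; label strings are hypothetical.
BIOME_MERGE = {
    "tundra": "taiga",
    "tropical rainforest": "tropical forest",
    "tropical seasonal forest": "tropical forest",
}

def merged_biome(biome):
    """Return the merged biome label, or None for biomes left out (deserts)."""
    if biome == "desert":
        return None
    return BIOME_MERGE.get(biome, biome)
```

With this convention, response variables would be passed with `centre=False`, and trait and diameter with `centre=True`.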
Estimating the effect of traits on the average differences between intra and
interspecific competition. Differences between inter and intraspecific competition
have long been considered key to community assembly and species
coexistence (refs 14, 35-38). Our estimated growth model allowed us to estimate the average inter
and intraspecific competition from trait-independent and trait-dependent
processes. For any combination of two trait values $t_i$ and $t_j$, we can predict the
interspecific ($\alpha_{t_i,t_j}$ and $\alpha_{t_j,t_i}$) and intraspecific ($\alpha_{t_i,t_i}$ and $\alpha_{t_j,t_j}$) competition parameters
for a typical species by leaving out the random species effects in equation (4). We
can then estimate the average differences between interspecific and intraspecific
competition over all trait values combinations using the following expression:


$$\frac{(\alpha_{t_i,t_j} - \alpha_{t_i,t_i}) + (\alpha_{t_j,t_i} - \alpha_{t_j,t_j})}{2} \qquad (8)$$


Substituting in from equation (4) (leaving out the species random effect) this sim- 
plifies as: 


$$\alpha_{0,\mathrm{inter}} - \alpha_{0,\mathrm{intra}} + \alpha_d |t_j - t_i| \qquad (9)$$


Thus, the average differences between inter and intraspecific competition are
affected only by the difference between $\alpha_{0,\mathrm{intra}}$ and $\alpha_{0,\mathrm{inter}}$ and by trait dissimilarity
via $\alpha_d$ (see Extended Data Fig. 2 for the results).
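
The step from equation (8) to equation (9) can be checked numerically. The sketch below assumes the pairwise coefficient takes the form of equation (6) with separate intra- and interspecific trait-independent terms and no species random effect. The tolerance and competitive-effect terms cancel in the average, leaving only the difference in trait-independent terms and the dissimilarity term. All parameter values and function names are hypothetical.

```python
import math

def alpha(t_focal, t_comp, conspecific, a0_intra, a0_inter, a_t, a_e, a_d):
    """Competition coefficient of a competitor (trait t_comp) on a focal
    species (trait t_focal), without species random effects."""
    a0 = a0_intra if conspecific else a0_inter
    return a0 - a_t * t_focal + a_e * t_comp + a_d * abs(t_comp - t_focal)

def mean_inter_minus_intra(t_i, t_j, **p):
    """Average difference between inter- and intraspecific competition
    for a pair of trait values (equation (8))."""
    inter_ij = alpha(t_i, t_j, False, **p)
    inter_ji = alpha(t_j, t_i, False, **p)
    intra_i = alpha(t_i, t_i, True, **p)
    intra_j = alpha(t_j, t_j, True, **p)
    return ((inter_ij - intra_i) + (inter_ji - intra_j)) / 2

def closed_form(t_i, t_j, a0_intra, a0_inter, a_d, **_unused):
    """Simplified expression (equation (9))."""
    return a0_inter - a0_intra + a_d * abs(t_j - t_i)

# Illustrative parameter values only.
P = dict(a0_intra=0.2, a0_inter=0.1, a_t=0.03, a_e=0.05, a_d=0.02)
```

With these values, both functions give −0.09 for trait values 0.4 and 0.9, a negative value meaning that intraspecific competition is the stronger of the two.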
Growth data. Our main objective was to collate data sets spanning the dominant 
forest biomes of the world. Data sets were included if they allowed both growth 
of individual trees and the local abundance of competitors to be estimated, and if 
they had good (>40%) coverage for at least one of the traits of interest (SLA, wood 
density and maximum height). 

The data sets collated fell into two broad categories: (1) national forest inven- 
tories (NFI), in which trees above a given diameter were sampled in a network of 
small plots (often on a regular grid) covering the country (references for NFI data
used: refs 39-48); (2) large permanent plots (LPP) ranging in size from 0.5 to 50 ha, in
which the x-y coordinates of all trees above a given diameter were recorded
(references for LPP data used: refs 49-56). LPP were mostly located in tropical regions.
The minimum diameter of recorded trees varied among sites from 1 to 12 cm. To
allow comparison between data sets, we restricted our analysis to trees greater 
than 10 cm. Moreover, we excluded from the analysis any plots with harvesting 
during the growth measurement period, that were identified as plantations, or that 
overlapped a forest edge. Finally, we randomly selected only two consecutive census 
dates per plot or quadrat to avoid having to account for repeated measurements 
(less than one-third of the data had repeated measurements). Because human 
and natural disturbances are present in all of these forests (see Supplementary 
Information), they probably all experience successional dynamics (as indicated 
by the forest age distribution available in some of these sites in Supplementary 
Information). See Supplementary Information and Extended Data Table 2 for more 
details on individual data sets. 

Basal area growth was estimated from diameter measurements recorded 
between the two censuses. For the French NFI, these data were obtained from 
short tree cores. For all other data sets, diameter at breast height (D) of each indi- 
vidual was recorded at multiple census dates. We excluded trees (1) with extreme 
positive or negative diameter growth measurements, following criteria developed 
at the BCI site*’ (see the R package CTFS R), (2) that were palms or tree ferns, or 
(3) that were measured at different heights in two consecutive censuses. 

For each individual tree, we estimated the local abundance of competitor species 
as the sum of basal area for all individuals >10 cm diameter within a specified 
neighbourhood. For LPPs, we defined the neighbourhood as a circle with a
15 m radius. This value was selected based on previous studies showing the
maximum radius of interaction to lie in the range 10-20 m (refs 8, 19). To avoid edge
effects, we also excluded trees less than 15 m from the edge of a plot. To account
for variation of abiotic conditions within the LPPs, we divided plots into regularly 
spaced 20 x 20 m quadrats and included a random quadrat effect in the model 
(see above). 
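
The neighbourhood calculation for mapped plots can be sketched as below. The sketch assumes each tree is a dictionary with x-y coordinates in metres, a diameter `dbh` in centimetres and a species label, and that the plot is a rectangle with its origin at one corner; these data structures and names are hypothetical.

```python
import math

def basal_area_m2(dbh_cm):
    """Cross-sectional stem area in square metres from diameter in cm."""
    return math.pi * (dbh_cm / 200.0) ** 2

def neighbour_basal_area(focal, trees, radius_m=15.0, min_dbh_cm=10.0):
    """Sum basal area per competitor species within radius_m of the focal tree.

    Trees at or below min_dbh_cm and the focal tree itself are skipped.
    """
    totals = {}
    for tree in trees:
        if tree is focal or tree["dbh"] <= min_dbh_cm:
            continue
        dist = math.hypot(tree["x"] - focal["x"], tree["y"] - focal["y"])
        if dist <= radius_m:
            sp = tree["species"]
            totals[sp] = totals.get(sp, 0.0) + basal_area_m2(tree["dbh"])
    return totals

def far_from_edge(tree, plot_width_m, plot_height_m, buffer_m=15.0):
    """True if the tree is at least buffer_m from every plot edge, so that
    its whole neighbourhood lies inside the plot."""
    return (buffer_m <= tree["x"] <= plot_width_m - buffer_m
            and buffer_m <= tree["y"] <= plot_height_m - buffer_m)
```

Only trees passing `far_from_edge` would be used as focal trees, while all mapped trees above the diameter threshold can still count as neighbours.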

For NFI data, coordinates of individual trees within plots were generally not
available, so neighbourhoods were defined based on plot size. In the NFI from
the United States, four sub-plots of 7.35 m radius located within 20 m of one another
were measured. We grouped these sub-plots to give a single estimate of the local
competitor abundance. Thus, the neighbourhoods used in the competition analysis
ranged in size from 10 to 25 m radius, with most plots from 10 to 15 m radius. We
included variation in neighbourhood size in the constant $w_i$ to compute competitor
basal area in m² ha⁻¹.
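
Converting summed stem area to a per-hectare value depends on the sampled area, which is why neighbourhood size has to enter the calculation. The sketch below shows one way this scaling could work for circular plots. How the constant for neighbourhood size is defined in the study is not given in this section, so the functions here are assumptions for illustration.

```python
import math

def basal_area_per_ha(dbh_list_cm, plot_radius_m, n_subplots=1):
    """Basal area in m2 per hectare for trees sampled on circular plots.

    dbh_list_cm: stem diameters in cm.
    plot_radius_m: radius of each circular (sub-)plot in metres.
    n_subplots: number of equal sub-plots pooled into one neighbourhood.
    """
    sampled_area_m2 = n_subplots * math.pi * plot_radius_m ** 2
    total_m2 = sum(math.pi * (d / 200.0) ** 2 for d in dbh_list_cm)
    return total_m2 / sampled_area_m2 * 10_000

def equivalent_radius(plot_radius_m, n_subplots):
    """Radius of a single circle with the same area as the pooled sub-plots."""
    return math.sqrt(n_subplots) * plot_radius_m
```

Under this reading, four pooled sub-plots of 7.35 m radius cover the same area as a single circle of about 14.7 m radius, which sits within the 10 to 15 m range quoted for most plots.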

We extracted MAT and MAP from the WorldClim database (ref. 57) using the plot
latitude and longitude (see Extended Data Fig. 1 for plot locations). MAT and MAP
data were then used to classify plots into biomes, using the diagram provided by 
ref. 58 (modified from ref. 59). 
Traits. Data on species functional traits were extracted from existing sources. We 
focused on wood density, species SLA and maximum height, because these traits 
have previously been related to competitive interactions and are available for large 
numbers of species®!!153° (see Extended Data Tables 3 and 4 for trait coverage 
and their correlations). Where available, we used data collected locally (references 
for the local trait data used in this analysis include refs 15, 51, 60-62); otherwise we 
sourced data from the TRY trait data base!® (references for the data extracted from 
the TRY database used in this analysis include refs 2, 3, 15, 63-130). Local data 
were available for most tropical sites and species (see Supplementary Information). 
Several of the NFI data sets also provided tree height measurements, from which 
we computed a species’ maximum height as the 99% quantile of observed values 
(for France, USA, Spain and Switzerland). For Sweden, we used the estimate from 
the French data set and for Canada we used the estimate from the USA data set. 
Otherwise, we extracted height measurements from the TRY database. We were 
not able to account for trait variability within species. 
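
The maximum-height estimate described above is a high quantile of the observed heights per species. A minimal sketch using linear interpolation between order statistics; the text does not say which quantile definition was used, so that choice, the function names and the input layout are assumptions.

```python
def quantile(values, q):
    """Quantile with linear interpolation between order statistics."""
    if not values:
        raise ValueError("no observations")
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(xs) - 1)
    frac = pos - lower
    return xs[lower] + frac * (xs[upper] - xs[lower])

def species_max_height(heights_by_species, q=0.99):
    """Map each species to the q-quantile of its observed tree heights.

    heights_by_species: dict of species name -> list of heights in metres.
    """
    return {sp: quantile(h, q) for sp, h in heights_by_species.items()}
```

Using a high quantile rather than the single tallest record makes the estimate less sensitive to measurement outliers, although it still depends on how many trees were measured per species.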

For each focal tree, our approach required us to also account for the traits of all 
competitors present in the neighbourhood. Most of our plots had good coverage of 
Extended Data Figure 1 | Map of the plot locations of all data sets analysed. LPP plots are represented with large points and NFI plots with small
points (the Panama data set comprises both a 50 ha plot and a network of 1 ha plots). The world map is from the R package rworldmap (ref. 131) using Natural
Earth data.
Extended Data Figure 2 | Average difference between interspecific and
intraspecific competition predicted with estimates of trait-independent
and trait-dependent processes influencing competition for models
fitted for each trait. a-c, Models were fitted for wood density (a), SLA
(b) or maximum height (c). The average differences between interspecific
and intraspecific competition are influenced by the $\alpha_{0,\mathrm{intra}}$, $\alpha_{0,\mathrm{inter}}$ and $\alpha_d$
coefficients (see Methods for details). Negative values indicate that
intraspecific competition is stronger than interspecific competition.
[Axis labels: Δ specific leaf area (mm² mg⁻¹); Δ maximum height (m)]
Extended Data Figure 3 | Variation of trait-independent inter
and intraspecific competition, trait dissimilarity ($|t_f - t_c| \times \alpha_d$),
competitive effect ($t_c \times \alpha_e$), tolerance to competition ($t_f \times \alpha_t$) and
maximum growth ($t_f \times m_1$) with wood density, SLA and maximum
height. a-o, Wood density (a-e), SLA (f-j) and maximum height (k-o).
Traits varied from their 5% quantile to their 95% quantile. The shaded
area represents the 95% confidence interval of the prediction (including
uncertainty associated with $\alpha_0$ or $m_0$). $\alpha_{0,\mathrm{intra}}$ and $\alpha_{0,\mathrm{inter}}$, which do not vary
with traits, are represented with their associated confidence intervals.
Extended Data Figure 4 | Trait-dependent and trait-independent
effects on maximum growth and competition across the globe and
… between intra and interspecific competition for wood density, SLA
and maximum height. a, Wood density. b, SLA. c, Maximum height. See
Fig. 2 in the main text for parameters description, and see Fig. 1a in the …
Extended Data Table 1 | Standardized coefficient estimates from models fitted for each trait

Parameter            Wood density      SLA               Maximum height
…                    0.016 (0.127)     -0.087 (0.132)    0.084 (0.089)
…                    0.418 (0.011)     0.401 (0.012)     0.42 (0.01)
…                    -0.149 (0.036)    0.119 (0.057)     0.063 (0.04)
…                    0.111 (0.003)     0.093 (0.003)     0.081 (0.002)
…                    0.053 (0.002)     0.056 (0.003)     0.048 (0.002)
…                    0.24 (0.037)      0.213 (0.052)     0.194 (0.046)
…                    0.086 (0.022)     0.071 (0.025)     0.094 (0.024)
…                    0.034 (0.016)     -0.083 (0.023)    0.017 (0.026)
…                    0.069 (0.021)     -0.009 (0.033)    -0.071 (0.032)
…                    0 (0.009)         -0.018 (0.015)    -0.017 (0.008)
R² (marginal)*       0.1393            0.1637            0.1429
R² (conditional)*    0.7297            0.7593            0.7166
ΔAIC                 0                 0                 0
ΔAIC no trait        2469              1651              2748

Parameter labels legible in the source: α0,intra; α0,inter; αe (remaining row labels elided).

Estimates and their standard error (in brackets) estimated for each trait, R² of models and ΔAIC of the model and of a model with no trait effect. See Methods for explanation of parameters.
*We report the conditional and marginal R² of the models using the methods of ref. 132, modified by ref. 133. ΔAIC is the difference in the Akaike's information criterion (AIC; as defined by ref. 134)
between the model and the best model (lowest AIC). The best-fitting model was identified as the one with a ΔAIC of 0. ΔAIC greater than 10 shows strong support for the best model.
Extended Data Table 2 | Trees data description

Data set                    No. of trees   No. of species   No. of plots/quadrats   % of angiosperm   % of evergreen
Sweden                      202480         26               22552                   27.0              73.0
New Zealand                 53775          117              1415                    94.0              99.1
US                          1370541        492              59840                   63.3              37.2
Canada                      495008         75               14983                   34.4              64.8
Australia                   906            101              63                      99.9              92.4
France                      184316         127              17611                   74.1              28.5
Switzerland                 28286          60               2597                    36.4              55.2
Spain                       418805         122              36462                   34.7              81.6
Panama                      27089          237              2033                    99.8              77.7
French Guiana               46360          712              2157                    100.0             83.5
Japan                       4658           139              318                     72.8              70.0
Taiwan                      14701          72               623                     92.0              75.3
Puerto Rico                 14011          82               399                     100.0             99.0
Central African Republic    17638          204              989                     99.5              72.4

For each site, the number of individual trees, species and plots in NFI data and quadrats in LPP data, and the percentage of angiosperm and evergreen species are shown.
Extended Data Table 3 | Traits data description

Data set                    % cover SLA   % cover wood density   % cover max height
Sweden                      99.7          99.6                   97.9
New Zealand                 99.8          99.5                   99.9
US                          91.3          94.4                   100.0
Canada                      99.4          99.4                   99.9
Australia                   0.0           99.2                   100.0
France                      99.2          98.9                   99.9
Switzerland                 96.7          95.1                   99.7
Spain                       97.3          98.9                   100.0
Panama                      93.0          93.1                   95.4
French Guiana               73.3          73.5                   63.5
Japan                       99.7          99.7                   100.0
Taiwan                      99.9          99.3                   95.8
Puerto Rico                 99.3          99.3                   99.3
Central African Republic    40.3          47.1                   0.0

The coverage in each site is given with the percentage of species with species level trait data.
Extended Data Table 4 | Species traits pairwise correlations

                Wood density   SLA     Max height
Wood density    1              0.18    -0.04
SLA                            1       0.24
Max height                             1.00

Pearson's R correlations for the three traits.
Earliest hominin occupation of Sulawesi, Indonesia


Gerrit D. van den Bergh, Bo Li, Adam Brumm, Rainer Grün, Dida Yurnaldi, Mark W. Moore, Iwan Kurniawan,
Ruly Setiawan, Fachroel Aziz, Richard G. Roberts, Suyono, Michael Storey, Erick Setiabudi & Michael J. Morwood


Sulawesi is the largest and oldest island within Wallacea, a vast zone 
of oceanic islands separating continental Asia from the Pleistocene 
landmass of Australia and Papua (Sahul). By one million years ago 
an unknown hominin lineage had colonized Flores immediately to 
the south (ref. 1), and by about 50 thousand years ago, modern humans
(Homo sapiens) had crossed to Sahul (refs 2, 3). On the basis of position,
oceanic currents and biogeographical context, Sulawesi probably
played a pivotal part in these dispersals (ref. 4). Uranium-series dating of
speleothem deposits associated with rock art in the limestone karst 
region of Maros in southwest Sulawesi has revealed that humans 
were living on the island at least 40 thousand years ago (ref. 5). Here 
we report new excavations at Talepu in the Walanae Basin northeast 
of Maros, where in situ stone artefacts associated with fossil remains 
of megafauna (Bubalus sp., Stegodon and Celebochoerus) have been 
recovered from stratified deposits that accumulated from before 
200 thousand years ago until about 100 thousand years ago. Our 
findings suggest that Sulawesi, like Flores, was host to a long- 
established population of archaic hominins, the ancestral origins 
and taxonomic status of which remain elusive. 

In the late 1940s the discovery of ‘Palaeolithic’ stone artefacts in 
association with Pleistocene fossil fauna in the Walanae Basin of south 
Sulawesi® (Fig. 1) led to considerable speculation about the time depth 
of human occupation of the island”*. The lithic assemblages comprised 
cores, choppers and flakes (the ‘Cabenge Industry’), and derived from 
undated surface collections along the eastern side of the Walanae 
River®®, which follows the Walanae Depression, an elongated north- 
south-trending fault-bounded basin (Extended Data Fig. 1). Fossils 
of several now-extinct species, including two pygmy proboscideans, 
a giant tortoise and a large endemic suid, Celebochoerus, were recov- 
ered from the same unstratified contexts”'° and excavations at various 
sites!’ Despite protracted investigations, the stratigraphic context and 
time range of the ‘Cabenge Industry’ remained unresolved because of 
a lack of in situ stone artefacts''. 

To clarify these issues we conducted surveys in the Cabenge area 
between 2007 and 2012, leading to the discovery of four new sites with 
in situ stone artefacts in their stratigraphic context. At Talepu, one of the 
newly discovered sites, we undertook deep-trench excavations. The site 
is 3km southeast of Cabenge and 13km downstream from where the 
Walanae River leaves its confining valley and enters a widening, actively 
subsiding floodplain towards the north (Extended Data Fig. 1). During 
the Pleistocene, east-west compression and wrench faulting along the 
Walanae fault zone resulted in uplift of the Sengkang anticline and the 
southern part of the Walanae Depression”. In these uplifted areas 
the folded Pliocene-Pleistocene sedimentary sequences of the Walanae 
Formation are now exposed’. In the northern part of the Walanae 
Depression, compressional down-folding facilitated accumulation 
of fluvio-lacustrine sediments from the Pleistocene to recent times. 
The Talepu site (4° 22’ 06.5” S, 119° 59’ 01.7” E) is near the hinge line 


between the uplifted southern part and the subsiding northern part of 
the Walanae Depression. 

Our excavations focused on the northernmost hill of an elongated 
ridge near Talepu village, ~600 m west of the Walanae River (Extended 
Data Fig. 1d). The Talepu Hill summit lies 32 m above sea level and 
18m above the adjacent Walanae River floodplain. Deposits exposed 
along this ridge comprise a coarsening-upward sequence of sub- 
horizontal fluvio-estuarine sand and silt layers overlain by alluvial 
cobble gravels (Extended Data Fig. 2). Two deep excavations were 
undertaken at Talepu (trenches T2 and T4) to provide a combined 
18.7 m long stratigraphic section exposing five main sedimentary units: 
units A-E in descending order of depth (Fig. 2). 

These excavations revealed the first evidence of in situ stone artefacts 
in securely stratified and dated contexts within the Walanae Basin. The 
T2 excavation yielded 270 stone artefacts between the surface and 4.2m 
depth (Fig. 3a-i, Extended Data Fig. 3b and Supplementary Table 1) 
which are associated with unit A’s high-energy fluvial gravel deposits. 
Hence most are water-rolled to various degrees, although 21% are in 
relatively fresh condition. The main source of raw material is coarse- to 
medium-grained silicified limestone cobbles measuring up to 130 mm 
in diameter. Most are medium- to large-sized flakes (Supplementary 
Table 2), with cores comprising 13% of the assemblage. Cores were 
reduced by hard-hammer blows to one face (42%) or bifacially (58%) 
from unprepared striking platforms. Core reduction was not intensive 
although seven cobbles were rotated and subsequent reduction cre- 
ated multiplatform cores. Flakes struck from the cobbles were them- 
selves reduced to one face (60%) or bifacially (40%). Although there 
is patterning in the flaking techniques, there is little evidence that the 
stoneworkers were creating tools of specific form; rather, stone-flaking 
produced sharp-edged flakes for use or as a source for additional flakes. 
Figure 1 | Sunda, Sahul and Wallacea. Wallacea has two major
biogeographical boundaries: Wallace’s Line to the west and Lydekker’s Line
to the east. Exposed land during periods of low sea level (−120 m)
is lightly shaded. Talepu area indicated by an arrow.
Figure 2 | Talepu excavations T2 and T4. a, Site map showing positions
of T2 and T4. Dotted line indicates profile shown in d; stratigraphy, fossil
and stone artefact occurrences, dating sampling horizons and ages for
excavations T2 (b) and T4 (c). Layer key: unit A, conglomerate interval
with seven distinct layers: A1, black topsoil; A2, sandy pebbly clay;
A3, pebbly sand; A4 and A5, coarse pebbly sand; A6, silty clay; A7, gravel;
unit B, sandy unit: B1, well-sorted, laminated silty sand; B2, medium- to
fine-grained sand; B3, well-sorted sand with pebbles near erosive base;
unit C, fine-grained sediments: C1, mottled silty clay; C2, silty clay;


From T4 we recovered 41 stone artefacts from the topsoil and 
colluvium down to a depth of 120 cm. However, four in situ
silicified limestone artefacts were in exposed older strata within the
silt of sub-unit E2 (Fig. 2c), and provide the stratigraphically
earliest evidence for human activity at Talepu. Two are unmodified
flakes (2.2-2.4 m depth) (Fig. 3l-m) and two are angular
scatter fragments (3.0-3.1 m depth) (Fig. 3j-k), probably created
in percussion flaking. The latter are made of a distinctive mot- 
tled silicified limestone and appear to have been removed from 
the same core. The artefacts bear no evidence of water transport; 
indeed, unit E did not yield any clasts indicative of high-energy 
water flow. 

Only one identifiable fossil was found in T2: a bovid lower molar 
fragment (4m depth) (Fig. 3t) that falls just above the size range of the 
extant lowland anoa, Bubalus depressicornis (Extended Data Fig. 3e). 
T4 yielded eight Celebochoerus dental elements (for example, a lower
canine; Fig. 3o) and three unidentifiable bone fragments, from the silty
interval of sub-unit E2 (between 3.1 and 4.0 m below the surface) and just

C3, mottled silty clay, plant remains; C4, laminated silty clay with sandy
streaks (with sandy intrusion in east baulk); unit D, coarse-grained
interval: D1, well-sorted, mottled pebbly sand; D2, very coarse
cross-bedded sand with rip clasts and mud drapes; D3, clast-supported gravel
with rip clasts; D4, medium- to fine-grained cross-bedded sand alternating
with mud drapes; unit E, fine-grained sediments: E1, clayey silt; E2,
mottled silts; E3, clayey silts. Colluvial unit X unconformably overlies
unit D. Vertical distribution of in situ artefacts in T4 plotted as number of
artefacts per 10 cm interval.


beneath the lowest stone artefacts. At least some of these fossil remains 
can be ascribed to one individual (Fig. 3q). A Stegodon milk molar 
fragment (1.9-2.0 m depth) (Fig. 3r) and a dermal scute of a crocodile 
(3.9-4.0 m) were also recovered. 

To constrain the age of the Talepu deposits, we obtained uranium- 
series ages for teeth and bones excavated from sub-unit E2 using laser
ablation inductively coupled plasma mass spectrometry (LA-ICP-MS) 
methods! (see Methods: uranium-series dating). Sequential laser spot 
analyses were undertaken on cross-sections of eight Celebochoerus fos- 
sils found between 0.2 and 0.5m below the deepest stone artefacts in the 
same silty unit (Extended Data Fig. 4a-i). Data sets were combined for 
each sample and a single age estimate was calculated using the diffusion- 
absorption-decay model'°. Most of the age results have infinite positive 
error bounds, so it was only possible to calculate minimum ages!*. 
The combined uranium-series results (Supplementary Table 3) indi- 
cate that the fossil samples are older than ~200 thousand years (kyr). 
Palaeomagnetic samples from silty layers of units A, C and E have nor- 
mal magnetic polarities at all sampled levels (Extended Data Fig. 4j 
Figure 3 | Finds from Talepu excavations 2 (T2) and 4/4B (T4/T4B). All
artefacts shown are silicified limestone. Flakes: a, T2-A7, 3.7-3.8 m; b, T2- 
Aj, 0.4-0.5 m; c, T2-Az, 3.6-3.7 m; d, T2-A>, 0.9m; e, T2-Aj, 0.2-0.3 m; 

f, T2-A>, 0.8-0.9 m; j, T4B-E>, 3.0-3.1 m; k, same material as j, but does 
not refit, T4B-E, 3.0-3.1 m; 1, T4-E, 2.38 m; m, T4-E), 2.2-2.4 m. Cores: 
g, radial core (T2-Aj, 0.1-0.3 m); h, core (T2-Az, 3.6-3.7 m); i, core (T2-Az, 
3.6-3.7 m). Fossils (“TLP’ numbers refer to the Bandung Geology Museum 
fossil collection numbers from Talepu): n, upper left incisor, Celebochoerus 
sp. (TLP10-F8*, T4-E,, 3.2-3.4m); 0, lower left canine, Celebochoerus sp. 
(TLP10-F1*, T4-Ep, 3.3 m); p, upper right third premolar, Celebochoerus 
sp. (TLP10-F5, T4-E», 3.3-3.4m); q, upper right (TLP10-F3*) and upper 
left third molar (TLP10-F4*), Celebochoerus sp. (T4-E2, 3.3-3.4m); 

r, Stegodon molar ridge fragment (D, dentine; OE, outer enamel; IE, inner 
enamel; OE and IE equal thickness (TLP12-F3, T4B-E), 1.9-2 m); 

s, upper left fourth premolar, Celebochoerus sp. (TLP10-F2*, T4-Es, 
3.2-3.3 m); t, lower left third molar fragment, Bovidae, cf. Bubalus sp. 
(TLP09-F2, T2-Az, 4-4.1 m). *Fossils used for uranium-series dating. 
Scale bars: a—i, 10 mm; j-t, 20mm. 


and 5). Normal polarity places the deposits within the Brunhes chron (younger than
~780 kyr); taken together with the uranium-series results, the fossils are
therefore >200 kyr and <780 kyr in age.

To constrain the age of the artefacts, a multiple-elevated-temperature 
post-infrared infrared stimulated luminescence (MET-pIRIR) dating 
procedure'®’” was applied to potassium-rich feldspar grains extracted 
from five sediment samples spanning the entire sequence. The four 
samples analysed from T2 have ages in stratigraphic order, from 103 ± 9 kyr
at 3 m depth to 156 ± 19 kyr at 10 m depth (Fig. 2, Methods: Optical
Dating, and Supplementary Table 4). These results suggest that
the Talepu cultural sequence ends at ~100 kyr, or possibly earlier
(see Methods). The sediments dated to 156 ± 19 kyr were deposited
near the top of unit D, which overlies the sedimentary layer (unit E) 
from which the deepest artefacts were excavated (more than 3m 
below). The oldest securely dated evidence for stone artefacts at Talepu 
is, therefore, 118 kyr to 194 kyr in age at the 95% confidence interval
(2σ), although human occupation of the site clearly occurred
earlier given the recovery of artefacts from greater stratigraphic depths
(Fig. 2c). Lastly, a sample taken at a depth of 8 m in the lower trench 
(T4) yielded a minimum age of ~195 kyr. This age estimate is strati- 
graphically consistent with the MET-pIRIR ages for T2 and with the 
minimum uranium-series ages of ~200 kyr for the T4 fossil remains 
from sub-unit E2).
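
The 118 kyr to 194 kyr range quoted above follows directly from the luminescence age of 156 ± 19 kyr if the stated uncertainty is one standard deviation, which is an assumption made here because it reproduces the quoted interval. A minimal sketch with a hypothetical function name:

```python
def two_sigma_range(age_kyr, sigma_kyr):
    """Return the (lower, upper) bounds of an age at two standard deviations."""
    return (age_kyr - 2 * sigma_kyr, age_kyr + 2 * sigma_kyr)
```

Applied to 156 and 19, this gives 118 and 194, so the lower bound is the source of the statement that occupation dates to at least 118 thousand years ago.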

From our Talepu excavation results it is now possible to conclude 
that the initial peopling of Sulawesi took place at least 118 thousand 
years ago (ka). The identity of these early inhabitants is of considerable
interest given previous assumptions that Sulawesi was only ever
colonized by H. sapiens, a species currently thought to have arrived in the
region by ~50 ka (refs 2, 3, 18-20). The earliest H. sapiens skeletal
remains from island Southeast Asia are ~45 kyr old (refs 21, 22); however,
modern human fossils dating to ~120 ka occur in the Levant (ref. 23), and
possibly at a similar time in Southeast Asia (ref. 24). Although controversial,
it is conceivable that H. sapiens dispersed soon after their emergence 
in Africa, spread to the easternmost tip of continental Asia (Sunda) 
and crossed to Wallacea by ~120 ka. However, early hominins had 
already reached the more remote and far smaller island of Flores by 
1 million years ago (ref. 1), perhaps by accidental drifting on tsu- 
nami debris*. It is therefore also conceivable that the first people on 
Sulawesi could have arrived in a similar manner at an equivalent, 
earlier or later time. 

Our findings at Talepu attest to the presence of early tool-makers 
on Sulawesi by the late Middle Pleistocene, but the absence of Pleistocene 
human fossils on the island precludes a definitive answer as to which 
hominin species was first to make landfall. With regard to potential
island colonizers, there are at least three candidates in the region: the
known and inferred distributions of H. floresiensis on Flores (~190 ka
(ref. 25) or earlier (ref. 1)), H. erectus on the southern margin of Sunda
(present-day Java) (~1.5 million years ago to ~140 ka (refs 26, 27)),
and ‘Denisovans’, whose geographic range may have extended into
Wallacea (ref. 28). Considering the predominantly southerly flowing currents
of the Indonesian through-flow (ref. 29), we speculate that the most likely
points of origin for the Sulawesi colonizers are Borneo to the west (part 
of mainland Asia during periods of low sea level) and the Philippines to 
the north* (the northern extremity of Wallacea), with the implication 
being that other islands in the region harbour undiscovered records of 
archaic hominins. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS

Excavation methods. In 2007 and 2008 we undertook three (T1-T3) 1m x 2m 
test excavations at Talepu Hill, where large numbers of stone artefacts were found 
scattered on the surface with loose gravel. The summit of Talepu Hill (4° 22′ 06.5″ S,
119° 59′ 01.7″ E) lies 36 m above sea level and 18 m above the floodplain of the
Walanae River, which flows 600 m to the east (Extended Data Fig. 1). Geological 
outcrop conditions are very poor, and thick tropical soils cover the underlying 
geological formations. The three test excavations near the summit of Talepu Hill 
proved the occurrence of in situ stone artefacts down to a depth of at least 1.8m, in 
heavily weathered conglomerate lenses and sandy silt layers. The same gravel unit 
occurs on other hilltops to the west and southwest. At Bulu Palece, 850m west of 
Talepu Hill, which is the highest hilltop in the vicinity with an elevation of 51m 
(see Extended Data Fig. 2), the gravel is at least 13 m thick, but at Talepu Hill only 
a basal interval of 4.3 m thickness remains. 

In October 2009, T2 was taken down to 7 m below surface (Extended Data 
Fig. 2b), at which depth the excavation area was reduced to a 1m x 1m square 
and taken down further to a maximum depth of 10 m. To ensure that this deep- 
trench operation was undertaken safely, we installed timber shoring as the work 
progressed (Extended Data Fig. 2c). A new east-west oriented, 1m x 9m trench 
(T4) was excavated at the base of the Talepu Hill, 40 m east of T2. This trench 
reached a maximum depth of 2 m, revealing the lateral development of the stra- 
tigraphy near the base of the hill (Fig. 2 and Extended Data Fig. 2d). Deposits 
were removed in 10cm spits within stratigraphic units. Stone artefacts and fossils 
found by the excavators were bagged and labelled immediately; all other deposits 
were dry sieved with 5 mm mesh to separate out clasts, including stone artefacts. 
Pebbles from each spit were weighed; and composition analysis was undertaken 
on clasts from a representative sample from six spits: average maximum clast 
diameter was recorded by measuring the longest diameter of the ten largest clasts 
per spit (Extended Data Fig. 3). Bulk samples of stratigraphic units were taken for 
sediment and pollen analyses. 
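
The clast-size measure described above is the mean of the longest diameters of the ten largest clasts in a spit. A minimal sketch, assuming the longest diameter of every measured clast is available as a list in millimetres; the function name and input layout are hypothetical.

```python
def average_max_clast_diameter(longest_diameters_mm, n_largest=10):
    """Mean longest diameter of the n_largest clasts in one spit.

    If fewer than n_largest clasts were measured, all of them are used.
    """
    largest = sorted(longest_diameters_mm, reverse=True)[:n_largest]
    if not largest:
        raise ValueError("no clasts measured")
    return sum(largest) / len(largest)
```

Restricting the average to the largest clasts gives a simple proxy for the competence of the flow that deposited the gravel, without needing a full grain-size distribution.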

In October 2010, the excavations at Talepu were continued. A 1m x 2m area at 
the east end of T4 was excavated to a depth of 6.20 m below the surface, thus pro- 
viding an additional 6 m stratigraphically below the section covered by excavation 
2 in 2009. The T4 deposits were removed in 20 cm spits within stratigraphic units. 
After the excavation of an in situ stone artefact (specimen S-TLP10-1, a flake from 
sub-unit E, at a depth of 2.38 m below the surface) and fossils of Celebochoerus, it 
was decided to wet-sieve all the excavated sediments with 3mm mesh to separate 
out stones and other clasts, including stone artefacts. Wet-sieving of the silty clay 
deposits from the interval between 2 and 2.4m depth yielded one more stone 
artefact (S-TLP10-2; Fig. 3m) and two possible stone artefacts (S-TLP10-3 and 
S-TLP10-4). Magnetic susceptibility measurements were taken from the excavation 
profile at 1 cm intervals with a Bartington MS-2 device, to examine the presence 
of cryptic tephra layers suitable for dating. A sample for ⁴⁰Ar/³⁹Ar dating was
taken at 2.5 m below the ground surface from an interval with elevated magnetic 
susceptibility values. 

In October 2012, backfill of T2 and T4 was removed. T4 was enlarged with 
a 1 m x 2m extension (T4-B), and both T2 and T4/4-B were taken further 
down with an additional 2m and 2.1 m, respectively, to allow for sampling for 
palaeomagnetic and optical dating methods. In T4-B two more stone artefacts 
(Fig. 3j-k), originating from spit 31 (3.0-3.1 m depth below ground level),
were recovered on the sieves. 

Stone artefacts were analysed following the definitions and methods in ref. 31. 

The analysis focused on stone-flaking techniques, sequences of reduction, and sizes 
of stone-flaking products and by-products (Supplementary Tables 1 and 2). The 
stone artefacts are stored at the Geology Museum in Bandung. 
Uranium-series dating. The details for laser ablation uranium-series analysis of 
skeletal materials were recently summarized. Uranium-series analyses provide
insights into when uranium migrates into a bone or tooth. This may happen a 
short time after the burial of the skeletal element or some significant time span 
later. There may also be later uranium-overprints that are difficult to recognize. 
As such, apparent uranium-series results from faunal remains have generally to be 
regarded as minimum age estimates. It is very difficult or impossible to evaluate by 
how much the uranium-series results underestimate the correct age of the sample. 
Details of the instrumentation, analytical procedures and data evaluation have 
been modified from those described in detail elsewhere. All isotope ratios
refer to activity ratios. 

Sequential laser spot analyses were undertaken on cross sections of eight 
Celebochoerus fossils from the T4 excavation at Talepu. They comprised fragments 
of six teeth and two bones from sub-unit E, found 10-50 cm below the lowest 
stone artefacts in the same silt layer. Of one fossil (TLP10-1, a Celebochoerus lower 
canine), two subsamples were analysed (a and b). Each fossil specimen was cut 
transversely using a dentist drill with a diamond saw blade (Extended Data Fig. 4). 


Four or five samples were then mounted together into aluminium cups, aligning 
the cross-sections with the outer rim of the sample holder, which later positioned 
the samples on the focal plane of the laser. Uranium-series isotopes were measured 
using the laser ablation multicollector (MC)-ICP-MS system at The Australian 
National University’s (ANU) Research School of Earth Sciences. It consists of a 
Finnigan MAT Neptune MC-ICP-MS equipped with multiple Faraday cups. At 
the time of measurement, the mass spectrometer had only one ion counter. This 
necessitated two sequential sets of measurements along parallel tracks, one for 
²³⁰Th and a second for ²³⁴U. The ion counter was set to either mass 230.1 or
234.1 while the Faraday cups measured the masses 232, 235 and 238. Samples were
ablated with a Lambda Physik LPFPro ArF excimer (λ = 193 nm) laser coupled to
the Neptune through an ANU-designed Helex ablation cell. 

The samples were initially cleaned for 10 s with the laser spot size set to 265 µm,
followed by a 50 s analysis run with a 205 µm spot size using a 5 Hz pulse rate.
Analyses were performed at regular intervals along traverses, all starting from 
the exterior surface (Extended Data Fig. 4a-i). The data sets of each transect were 
bracketed between reference standard analyses to correct for instrument drift. 
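
The bracketing correction can be illustrated with a minimal sketch, assuming that instrument drift is linear in time between the two standard analyses. The function name, its arguments and the numbers are hypothetical; they are not taken from the data-reduction routine used for the Talepu samples.

```python
def bracket_correct(measured, times, std_before, std_after,
                    t_before, t_after, std_true):
    """Correct measured isotope ratios for instrument drift.

    Hypothetical helper: assumes the measured ratio of the standard
    drifts linearly between the analyses at t_before and t_after.
    """
    corrected = []
    for ratio, t in zip(measured, times):
        # Fractional position of this spot between the two standard runs.
        frac = (t - t_before) / (t_after - t_before)
        # Ratio the standard would have given at time t.
        std_at_t = std_before + frac * (std_after - std_before)
        # Scale the sample by the true/measured ratio of the standard.
        corrected.append(ratio * std_true / std_at_t)
    return corrected


# Illustrative values only: the standard reads 2% low before the
# transect and 2% high after it.
print(bracket_correct([0.98, 1.00, 1.02], [0.0, 5.0, 10.0],
                      std_before=0.98, std_after=1.02,
                      t_before=0.0, t_after=10.0, std_true=1.00))
```

In this toy case the three sample spots drift in step with the standard, so all corrected ratios come out close to 1.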

Semi-quantitative estimates of uranium and thorium concentrations were derived
from repeated measurements of the SRM NIST-610 glass (uranium = 461.5 µg g⁻¹;
thorium = 457.2 µg g⁻¹), and uranium-isotope ratios from repeated measurements
of rhinoceros tooth dentine from Hexian (sample 1118). Age estimates combining
all measurements on a specimen were calculated using the iDAD program,
assuming diffusion from both surfaces for the bones (TLP10-6 and 7) and roots of 
the teeth (Extended Data Fig. 4a-f, h, i) and directional diffusion from the central 
pulp cavity into the dentine and covering enamel for TLP10-9 (Extended Data 
Fig. 4g). The enamel data were omitted, as enamel has a
different diffusion rate. Generally, results with elemental U/Th <300 are rejected,
as these are associated with detrital contamination. However, this applied only to
a single measurement. The finite ages are given with 2σ error bands; the infinite
results only refer to the lower bound of the 2σ confidence interval (Supplementary
Table 3). None of the samples showed any indication of uranium leaching, which is
expressed either by sections with ²³⁰Th/²³⁸U >> ²³⁴U/²³⁸U or by increasing ²³⁰Th/²³⁴U
ratios towards the surface in conjunction with decreasing uranium concentrations.

Five samples had infinite positive error bounds and it was thus only possible to
calculate minimum ages. The uranium-series results may change
over small distances within a sample: the first data set of TLP10-1 yielded a finite
result of 161 ± 15 kyr, whereas the second set yielded a minimum age of >255 kyr. As
mentioned above, all uranium-series results, whether finite or infinite, have
to be regarded as minimum age estimates. If the faunal elements represent a single
population, the uranium-series results indicate that the Talepu samples are most
probably older than ~350 kyr, but certainly older than ~200 kyr (Supplementary
Table 3). The large errors do not allow us to further constrain the age.
Infrared stimulated luminescence dating of feldspar grains. Optical dating pro- 
vides an estimate of the time since grains of quartz or potassium-rich feldspar were 
last exposed to sunlight. The burial age is estimated by dividing the equivalent
dose (Dₑ, a measure of the radiation energy absorbed by grains during their
period of burial) by the environmental dose rate (the rate of supply of ionizing
radiation to the grains over the same period). Dₑ is determined from the laboratory
measurements of the optically stimulated luminescence (OSL) from quartz or the 
infrared stimulated luminescence (IRSL) from potassium (K)-feldspar, and the 
dose rate is estimated from laboratory and field measurements of the environ- 
mental radioactivity. 
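
The age equation described above (burial age = equivalent dose / environmental dose rate) can be written as a short sketch. The input values are illustrative rather than measured Talepu data, and the error propagation assumes independent, purely random uncertainties, which is a simplification of how luminescence ages are usually reported.

```python
import math


def burial_age(de_gy, de_err, dose_rate, dose_rate_err):
    """Return (age, 1-sigma error) in kyr.

    de_gy, de_err: equivalent dose and its error (Gy).
    dose_rate, dose_rate_err: environmental dose rate and error (Gy/kyr).
    """
    age = de_gy / dose_rate
    # Relative errors of a quotient add in quadrature.
    rel = math.sqrt((de_err / de_gy) ** 2 + (dose_rate_err / dose_rate) ** 2)
    return age, age * rel


# Hypothetical example: De = 500 +/- 25 Gy, dose rate = 4.0 +/- 0.2 Gy/kyr.
age, err = burial_age(500.0, 25.0, 4.0, 0.2)
print(f"{age:.0f} +/- {err:.0f} kyr")  # prints "125 +/- 9 kyr"
```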

K-feldspar has two advantages over quartz for optical dating: (1) the IRSL signal 
(per unit absorbed dose) is usually much brighter than the OSL signal from quartz; 
and (2) the IRSL traps saturate at a much higher dose than do the OSL traps, 
which makes it possible to date older samples using feldspars than is feasible using 
the OSL signal from quartz. However, the routine dating of K-feldspars using the 
IRSL signal has been hampered by the malign phenomenon of ‘anomalous fading’ 
(that is, the leakage of electrons from IRSL traps at a faster rate than expected 
from kinetic considerations), which gives rise to substantial underestimates of
age unless an appropriate correction is made. Recently, IRSL traps that are less
prone to fading have been identified, using either a post-infrared IRSL (pIRIR)
approach or a multiple-elevated-temperature pIRIR (MET-pIRIR) procedure¹⁶. The
progress, potential and remaining problems in using these pIRIR signals for
dating have been reviewed recently¹⁷.

Dating the samples from Talepu using quartz OSL is impractical because of the 
paucity of quartz. Furthermore, the quartz OSL traps are expected to be in satu- 
ration, owing to the ages of the samples (>100 kyr) and the high environmental 
dose rates of the deposits (4-5 Gy/kyr). In this study, we applied the MET-pIRIR 
procedure to K-feldspar extracts from Talepu to isolate the light-sensitive IRSL 
signal that is least prone to anomalous fading. We also allowed for any residual 
dose at the time of sediment deposition, to account for the fact that pIRIR traps 
are less easily bleached than the ‘fast’ component OSL traps in quartz. The resulting
MET-pIRIR ages should, therefore, be reliable estimates of the time of sediment
deposition at Talepu. 

The total environmental dose rate for K-feldspar grains consists of four compo- 
nents: the external gamma, beta and cosmic-ray dose rates, and the internal beta 
dose rate. The dosimetry data for all samples are summarized in Supplementary 
Table 4. 

The external gamma dose rates were measured using an Exploranium GR-320 
portable gamma-ray spectrometer, equipped with a 3-inch diameter NaI(Tl) crystal
calibrated for uranium, thorium and potassium concentrations using the CSIRO
facility at North Ryde. At each sample location, three or four measurements
of 300 s duration were made of the gamma dose rate at field water content. The
external beta dose rate was measured by low-level beta counting using a Risø
GM-25-5 multicounter system and referenced to the Nussloch Loess (Nussi)
standard. The external beta dose rate was corrected for the effect of grain size and
hydrofluoric acid etching on beta-dose attenuation. These external components 
of the total dose rate were adjusted for assumed long-term water contents of 20% 
for the Talepu Upper Trench (TUT = T2) samples and 30% for the Talepu Lower
Trench (TLT = T4) sample (TUT and TLT sample numbers refer to the Centre
for Archaeological Science laboratory numbers). These values are based on the
measured field water contents (Supplementary Table 4), together with an assigned
1σ uncertainty of ±5% to capture the likely range of time-averaged mean values
over the entire period of sample burial. 
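
The effect of the assumed water content on the external dose rate can be sketched as follows. The text does not state which attenuation formula was applied; this example assumes the widely used form D_wet = D_dry / (1 + xW), with x = 1.25 for beta and x = 1.14 for gamma radiation, and W expressed as mass of water per mass of dry sediment. The dry dose rates are hypothetical.

```python
# Attenuation coefficients commonly used in luminescence dosimetry
# (assumed here; not quoted in the source text).
BETA_X = 1.25
GAMMA_X = 1.14


def wet_dose_rate(dry_rate, water_frac, x):
    """Dose rate in wet sediment, given the dry dose rate (Gy/kyr) and
    the water content as mass of water / mass of dry sediment."""
    return dry_rate / (1.0 + x * water_frac)


# Hypothetical dry dose rates (Gy/kyr); water content of 20 +/- 5%,
# as assumed for the TUT samples.
dry_beta, dry_gamma = 2.5, 1.5
for w in (0.15, 0.20, 0.25):
    total = (wet_dose_rate(dry_beta, w, BETA_X)
             + wet_dose_rate(dry_gamma, w, GAMMA_X))
    print(f"water {w:.0%}: beta + gamma = {total:.2f} Gy/kyr")
```

Evaluating the dose rate at the central water content and at ±1σ, as in the loop, shows how the water-content uncertainty feeds into the uncertainty on the age.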

To check the equilibrium status of the ²³⁸U and ²³²Th decay chains, each sample
was dried, ground to a fine powder and then analysed by high-resolution
gamma-ray spectrometry (HRGS). The measured activities of ²³⁸U, ²²⁶Ra and
²¹⁰Pb in the ²³⁸U series, ²²⁸Ra and ²²⁸Th in the ²³²Th series, and ⁴⁰K are listed in
Supplementary Table 5. The activities of ²²⁸Ra and ²²⁸Th were close to equilibrium
for all of the samples, as is commonly the case with the ²³²Th series. By contrast,
the ²³⁸U chain of each sample, except TUT-OSL9, was in disequilibrium at the
present day. Sample TUT-OSL2 had a 39-45% deficit of ²²⁶Ra and ²¹⁰Pb relative
to the parental ²³⁸U activity, whereas sample TUT-OSL3 had a 224-345% excess
of the daughter nuclides. Samples TUT-OSL1 and TLT-OSL6 had ²²⁶Ra deficits of
50% and 26%, respectively, relative to their ²³⁸U activities, but the ²¹⁰Pb activities
of both samples were similar to their parental ²³⁸U activities.

Sample TUT-OSL3 was the only sample with a present-day excess of ²²⁶Ra.
This sample was from a sandy layer (unit B) through which ground water could
percolate, so we attributed the observed ²²⁶Ra excess to the deposition of radium
transported by ground water. Given the similar ²³⁸U activities of TUT-OSL3 and
nearby TUT-OSL2, it is reasonable to assume that the parental uranium activity
had not changed substantially during the period of burial of either sample, and
that the ²²⁶Ra excess in TUT-OSL3 most probably occurred recently. The latter can
be deduced from the fact that ²²⁶Ra has a half-life of ~1,600 years, which is short
relative to the ages of our samples (>100 kyr), so any unsupported excess of ²²⁶Ra
would have decayed back into equilibrium with ²³⁸U within ~8 kyr of deposition
(that is, five half-lives of ²²⁶Ra). The alternative option, that groundwater has
continuously supplied excess ²²⁶Ra to unit B, is not supported by the disequilibrium
between ²²⁶Ra and ²¹⁰Pb: the latter nuclide has a half-life of ~22 years, so it should
remain in equilibrium with ²²⁶Ra if the latter is supplied continuously and no radon
gas is lost to atmosphere. Moreover, as the return of ²¹⁰Pb to equilibrium with ²²⁶Ra
is governed by the half-life of the shorter-lived nuclide, it could be argued that the
excess ²²⁶Ra was deposited within the past ~110 years (five half-lives of ²¹⁰Pb).
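
The "five half-lives" argument used above follows from simple exponential decay, as the sketch below shows: after five half-lives only about 3% of an unsupported excess (or deficit) remains. The half-lives are the approximate values quoted in the text; the function name is illustrative.

```python
def remaining_fraction(elapsed_yr, half_life_yr):
    """Fraction of an unsupported excess (or deficit) still present
    after elapsed_yr years, for a nuclide with the given half-life."""
    return 0.5 ** (elapsed_yr / half_life_yr)


RA226_HALF_LIFE = 1600.0  # years, approximate value quoted in the text
PB210_HALF_LIFE = 22.0    # years, approximate value quoted in the text

for name, t_half in (("226Ra", RA226_HALF_LIFE), ("210Pb", PB210_HALF_LIFE)):
    t = 5 * t_half
    frac = remaining_fraction(t, t_half)
    print(f"{name}: fraction {frac:.4f} left after {t:.0f} years")
```

Five half-lives correspond to 8,000 years for ²²⁶Ra and 110 years for ²¹⁰Pb, matching the ~8 kyr and ~110 year windows used in the argument.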

Fortunately, the calculated age of TUT-OSL3 is not especially sensitive to different
assumptions about the timing or extent of disequilibria in the ²³⁸U series.
The latter accounts for only 28% of the total dose rate estimated from the HRGS
data in Supplementary Table 5; this assumes that the present-day nuclide activities
have prevailed throughout the period of sample burial. If, instead, as we consider
more likely, the observed excess in ²²⁶Ra was deposited recently and the ²³⁸U decay
chain had been in equilibrium for almost all of the period of sample burial, then
the ²³⁸U series accounted for only 12% of the total dose rate (that is, using activities
of 37 ± 4 Bq kg⁻¹ for ²³⁸U, ²²⁶Ra and ²¹⁰Pb). The ages calculated under these two
alternative scenarios, using only the HRGS data for estimating external beta and 
gamma dose rates, range from ~118 kyr to ~143 kyr (Supplementary Table 5). 

Sample TUT-OSL2 was from the more silty overlying layer (sub-unit A7) and
had deficits of ²²⁶Ra and ²¹⁰Pb relative to ²³⁸U, but these disequilibria were much
smaller in magnitude than those of TUT-OSL3. If it were not continuously leached
from the sample, ²²⁶Ra would return to secular equilibrium with ²³⁸U within ~8 kyr,
so the existence of disequilibrium in TUT-OSL2 adds further weight to the argument
for recent transport of ²²⁶Ra in ground water at Talepu. The alternative is that
²²⁶Ra has been leached continuously from this sample, so we performed the same
sensitivity test on the dose rates and ages as that performed on TUT-OSL3. For
TUT-OSL2, the ages determined using the present-day HRGS data or activities of
41 ± 3 Bq kg⁻¹ for ²³⁸U, ²²⁶Ra and ²¹⁰Pb are statistically indistinguishable (130 ± 12
and 125 ± 11 kyr, respectively; Supplementary Table 5), because the disequilibria
are much less marked than in TUT-OSL3 and the ²³⁸U series makes only a small
contribution (10-14%) to the total dose rate of TUT-OSL2.

Samples TUT-OSL1 and TLT-OSL6 had deficits of ²²⁶Ra relative to ²³⁸U, but
similar activities of ²³⁸U and ²¹⁰Pb. The latter additionally strengthens our proposition
that ²²⁶Ra was leached from these sediments recently, because ²¹⁰Pb should
return to a state of equilibrium with ²²⁶Ra within ~110 years (five half-lives of
²¹⁰Pb). For both samples, the ages calculated using the present-day HRGS data
were statistically concordant with those estimated by assuming that the ²³⁸U
chain had been in secular equilibrium for almost the entire period of sample
burial (Supplementary Table 5). The same applies to sample TUT-OSL9, since the
measured activities of ²³⁸U, ²²⁶Ra and ²¹⁰Pb were consistent at 1σ.

To calculate the ages of the Talepu samples, we used the beta dose rates deduced 
from direct beta counting and the in situ gamma dose rates measured at each sam- 
ple location. The external beta dose rates determined from beta counting and from 
the HRGS data (Supplementary Table 5) were statistically consistent (at 2σ) for all
five samples; such agreement is expected, as both measure the present-day activities.
The field gamma dose rates are also based on the nuclide activities prevailing
at the time of measurement (²¹⁴Bi, a short-lived nuclide between ²²⁶Ra and ²¹⁰Pb,
being used for the ²³⁸U series) and, importantly, take into account any spatial
heterogeneity in dose rate from the ~30 cm of deposit surrounding each sample. 

The in situ gamma dose rates for samples TUT-OSL1 and TLT-OSL6 were 
consistent at 1σ with those estimated from the HRGS activities, whereas the
field gamma dose rates for TUT-OSL2, -OSL3 and -OSL9 were either higher or 
lower than those calculated from the HRGS data. The lower in situ gamma dose 
rate of TUT-OSL3 can be explained by the location of this sample close to the 
boundary with the TUT-OSL2 sediments, which have a smaller beta dose rate 
(Supplementary Table 4), and vice versa for the elevated field gamma dose rate of 
the latter sample. This result also indicates that the ²²⁶Ra and ²¹⁰Pb deficits
(TUT-OSL2) and excesses (TUT-OSL3) were spatially localized and not pervasive in the
30 cm of deposit surrounding these samples.

Under dim red laboratory illumination, the collected samples (see Methods) 
were treated with hydrochloric acid and hydrogen peroxide solutions to remove 
carbonates and organic matter, then dried. Grains of 90-180 or 180-212 µm in
diameter were obtained by dry sieving. The K-feldspar grains were separated
from quartz and heavy minerals using a sodium polytungstate solution of density
2.58 g cm⁻³, and etched in 10% hydrofluoric acid for 40 min to clean the surfaces of
the grains and remove (or greatly reduce in volume) the external alpha-irradiated 
layer of each grain. For each sample, 8-14 aliquots were prepared by mounting 
grains as a 5-mm-diameter monolayer in the centre of a 9.8-mm-diameter stainless 
steel disc, using ‘Silkospray’ silicone oil as the adhesive. This resulted in each aliquot 
consisting of several hundred K-feldspar grains. 

The single-aliquot regenerative-dose (SAR) MET-pIRIR procedure introduced 
in ref. 16 was adapted for the Talepu samples in this study. We modified the original
procedure by using a preheat at 320°C (rather than 300°C) for 60 s, to avoid
significant influence from residual phosphorescence while recording the MET-pIRIR
signal at 250°C (Supplementary Table 6). In addition, following ref. 47, we
used a 2 h solar simulator bleach before each regenerative dose cycle, instead of the
high-temperature infrared bleaching step used originally, as this proved essential 
for recovering a given laboratory dose (see below). 

Example IRSL (50°C) and MET-pIRIR (100-250 °C) decay curves are shown 
in Extended Data Fig. 6a for an aliquot of sample TUT-OSL2. The decay curves 
observed at the different stimulation temperatures are similar in shape, with 
initial MET-pIRIR signal intensities of the order of a few thousand counts per 
second. Extended Data Fig. 6b shows the corresponding dose-response curves 
for the same aliquot. Each sensitivity-corrected (Lₓ/Tₓ) dose-response curve
was fitted using a single saturating-exponential function of the form
I = I₀[1 − exp(−D/D₀)], where I is the Lₓ/Tₓ value at regenerative dose D, I₀ is the
saturation value of the exponential curve and D₀ is the characteristic saturation dose.
The D₀ values are shown next to each dose-response curve in Extended Data
Fig. 6b. For a total of 38 aliquots drawn from all 5 samples, we calculated the
D₀ values for the 250°C MET-pIRIR signal; these are plotted in Extended Data
Fig. 6c. On a ‘radial plot’ such as this, the most precise estimates fall to the right and
the least precise to the left. If these independent estimates are statistically consistent
with a common value at 2σ, then 95% of the points should scatter within a band
of width ±2 units projecting from the left-hand (‘standardized estimate’) axis to
the common value on the right-hand, radial axis. The radial plot thus provides
simultaneous information about the spread, precision and statistical consistency
of experimental data. The measured D₀ values range from ~220 to ~600 Gy,
with the vast majority consistent at 2σ with a common value of ~360 Gy. The
average D₀ value (calculated using the central age model) is 358 ± 14 Gy, with the
standard error taking the extent of overdispersion (16 ± 4%) into account. If we
adopt the doses corresponding to 90% (2.3D₀) and 95% (3D₀) of the saturation
level of the typical dose-response curve as the upper limits for reliable estimation
of Dₑ, then the maximum reliable Dₑ values that we can determine using
the 250°C MET-pIRIR signal are ~820 Gy and ~1070 Gy, respectively, for these
samples.
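
The link between the 2.3D₀ and 3D₀ limits and the 90% and 95% saturation levels follows directly from the saturating-exponential form given above. The sketch below evaluates it for the average D₀ of 358 Gy reported in the text; the function names are illustrative.

```python
import math


def saturation_fraction(dose, d0):
    """I/I0 for a single saturating exponential: 1 - exp(-D/D0)."""
    return 1.0 - math.exp(-dose / d0)


def dose_at_fraction(fraction, d0):
    """Dose at which the signal reaches the given fraction of saturation."""
    return -d0 * math.log(1.0 - fraction)


D0 = 358.0  # Gy, average D0 quoted in the text

for mult in (2.3, 3.0):
    dose = mult * D0
    print(f"{mult} x D0 = {dose:.0f} Gy -> "
          f"{saturation_fraction(dose, D0):.1%} of saturation")
```

The two doses, 823 Gy and 1074 Gy, round to the ~820 Gy and ~1070 Gy limits given in the text.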

To validate whether the MET-pIRIR procedure is applicable to the Talepu sam- 
ples, we conducted dose recovery, anomalous fading and residual dose tests. For 
the latter, four aliquots of each sample were bleached for 4-5 h using a Dr Hönle
solar simulator (model UVACUBE 400). The residual doses were then estimated 
by measuring these bleached aliquots using the modified MET-pIRIR procedure 
(Supplementary Table 6). The residual doses obtained for each of the TUT samples 
are plotted against stimulation temperature in Extended Data Fig. 7a. The IRSL 
signal measured at 50°C has a few grays of residual dose, which increases as the 
stimulation temperature is raised, attaining values of 16-20 Gy at 250°C. These
residual doses are only about 2-3% of the corresponding Dₑ values for the 250°C
signal, and they were subtracted from the Dₑ values for the respective samples before
calculating their ages.

It was noted in ref. 52 that a simple subtraction of the residual dose from the
apparent Dₑ value could result in underestimation of the true Dₑ value if the residual
signal is large relative to the bleachable signal. Accordingly, that study advocated the
use of an ‘intensity-subtraction’ procedure instead of the simple ‘dose-subtraction’
approach for samples with large residual doses. The dose-subtraction approach
should be satisfactory for the Talepu samples, however, given the small size of
the residual doses compared with the Dₑ values obtained from the MET-pIRIR
250°C signal.

A dose recovery test was conducted on sample TUT-OSL1. Eight aliquots
were bleached by the solar simulator for 5 h, then given a ‘surrogate natural’ dose
of 550 Gy. Four of these aliquots were measured using the original MET-pIRIR
procedure¹⁶, with a ‘hot’ infrared bleach of 320°C for 100 s applied at the end of
each SAR cycle (step 15 in Supplementary Table 6). The other four aliquots were
measured using the modified MET-pIRIR procedure (Supplementary Table 6),
with a solar simulator bleach of 2 h used at step 15. The measured doses at each
stimulation temperature were then corrected for the corresponding residual 
doses (Extended Data Fig. 7a), and the ratios of measured dose to given dose were 
calculated for the IRSL and MET-pIRIR signals. 

The dose recovery ratios are plotted in Extended Data Fig. 7b, which shows 
that a hot bleach at the end of each SAR cycle results in significant overestimation 
of the known (given) dose; for the MET-pIRIR 250°C signal, an overestimation 
of 48% was observed. For these same four aliquots, we obtained a ‘recycling ratio’ 
(the ratio of the Lₓ/Tₓ signals for two duplicate regenerative doses) consistent with
unity (1.00 ± 0.03), which indicates that the test-dose sensitivity correction worked
successfully between regenerative-dose cycles. The overestimation in recovered 
dose, therefore, implies failure of the sensitivity correction for the surrogate natural 
dose: that is, the extent of sensitivity change between measurement of the surrogate 
natural and its corresponding test dose differs from the changes occurring in the 
subsequent regenerative-dose cycles. The surrogate natural and regenerative-dose 
cycles differ only in respect to the preceding bleaching treatment (that is, a solar 
simulator bleach was used for the former and a hot bleach for the latter), so we 
compared these results with those obtained for the four aliquots that were bleached 
at the end of each regenerative-dose cycle using the solar simulator. The dose 
recovery results improved significantly using this modified procedure (Extended 
Data Fig. 7): all of the measured/given dose ratios were consistent with unity (at
2σ) for the signals measured at different temperatures, with a ratio of 1.02 ± 0.03
obtained for the MET-pIRIR 250°C signal. 

The results of the dose recovery test on sample TUT-OSL1 suggest that 
the MET-pIRIR procedure could successfully recover a known dose given to 
K-feldspars from Talepu, but only when a solar simulator bleach was applied at 
the end of each SAR cycle. We therefore adopted this procedure to measure the Dₑ
values for all five Talepu samples. 

Previous studies of pIRIR signals have shown that the anomalous fading rate 
(g value) depends on the stimulation temperature, with negligible fading of MET-pIRIR
signals stimulated at temperatures of 200°C and above¹⁶,¹⁷. Accordingly,
no fading correction is required for these high-temperature MET-pIRIR signals.
To check that this finding also applied to the Talepu samples, fading tests were
conducted on six aliquots of sample TUT-OSL3 that had already been used for Dₑ
measurements. We adopted a single-aliquot procedure similar to that described 
in ref. 53, but based on the MET-pIRIR signals. Doses of 110 Gy were admin- 
istered using the laboratory beta source, and the irradiated aliquots were then 
preheated and stored for periods of up to 1 week at room temperature (~20°C). 
For practical reasons, we used a hot bleach (320°C for 100 s) instead of a solar
simulator bleach at the end of each SAR cycle, but this choice should not have
affected the outcome of the fading test, given the aforementioned recycling ratio of
unity obtained using the hot bleach. Extended Data Fig. 7c shows the decay in the
sensitivity-corrected MET-pIRIR signal as a function of storage time for these six
aliquots, normalized to the time of prompt measurement (which ranged from 720 s
for the 50°C IRSL to 1480 s for the 250°C MET-pIRIR signal). The corresponding
fading rates (g values) were calculated for the IRSL and MET-pIRIR signals
(Extended Data Fig. 7d). The highest fading rate was observed for the 50°C IRSL
signal (5.5 ± 0.4% per decade), and the rate decreases as the stimulation temperature is
increased, falling to 0.94 ± 0.92 and 0.17 ± 1.13% per decade for the 200 and 250°C
signals, respectively. The latter g value is consistent with zero at 1σ, so we used the
Dₑ value obtained from the 250°C signal to date each of the samples. We note,
however, that the g values for the 200 and 250°C signals have large uncertainties, 
owing to the difficulty in obtaining precise estimates at low fading rates, so our data 
do not exclude the possibility that the high-temperature signals may fade slightly. 
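
A g value of the kind reported above is conventionally the percentage of signal lost per decade (factor of ten) of storage time. The sketch below estimates it by a least-squares fit of I(t) = I_c[1 − (g/100) log₁₀(t/t_c)], where t_c is the time of the prompt measurement. The fitting routine and the synthetic data are illustrative assumptions, not the procedure or data of the study.

```python
import math


def g_value(times_s, signals, t_c):
    """Least-squares estimate of the fading rate g (% per decade).

    Fits I(t) = I_c * [1 - (g / 100) * log10(t / t_c)], where t_c is the
    time of the prompt measurement used for normalization.
    """
    x = [math.log10(t / t_c) for t in times_s]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(signals) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # fitted signal at t = t_c
    return -100.0 * slope / intercept


# Synthetic, noise-free data generated with g = 5.5% per decade.
t_c = 720.0                                  # s, prompt measurement
times = [720.0, 7200.0, 72000.0, 604800.0]   # up to one week of storage
signals = [1.0 - 0.055 * math.log10(t / t_c) for t in times]
print(f"g = {g_value(times, signals, t_c):.2f}% per decade")
```

With real data the scatter of the points about the fitted line determines the uncertainty on g, which is why low fading rates such as 0.17 ± 1.13% per decade are difficult to distinguish from zero.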

On the basis of the results of the performance tests described above, the MET-pIRIR
procedure in Supplementary Table 6 was used to estimate the Dₑ values
for all four TUT samples, as well as one sample (TLT-OSL6) collected from near
the base of the stratigraphically underlying deposits in the TLT. The Dₑ estimates
obtained for the TUT samples using the MET-pIRIR 250°C signal are shown in
Extended Data Fig. 8. Most of the estimates are distributed around a central value,
although the spread is larger than can be explained by the measurement uncertainties
alone. The overdispersion among these Dₑ values is ~20% for three of the
TUT samples and almost twice this amount for TUT-OSL9, the latter arising from
a pair of low Dₑ values measured with relatively high precision. To estimate the age
for each of these samples, we determined the weighted mean Dₑ of the individual
single-aliquot values using the central age model, which takes account of the
measured overdispersion in the associated standard error.
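
For readers unfamiliar with the central age model, the sketch below shows one common way of computing it: a fixed-point iteration for the maximum-likelihood central value and overdispersion, applied to log-transformed Dₑ values with relative errors. This is an illustrative implementation under those assumptions, not the code used in the study, and the aliquot values are invented.

```python
import math


def central_age_model(de_values, de_errors, sigma0=0.15, tol=1e-10,
                      max_iter=1000):
    """Return (central dose, its standard error, overdispersion sigma).

    Works on log(De) with relative errors and iterates the
    maximum-likelihood equations to a fixed point.
    """
    d = [math.log(v) for v in de_values]
    s = [e / v for v, e in zip(de_values, de_errors)]  # relative errors
    sigma = sigma0
    for _ in range(max_iter):
        w = [1.0 / (sigma ** 2 + si ** 2) for si in s]
        delta = sum(wi * di for wi, di in zip(w, d)) / sum(w)
        new_sigma = sigma * math.sqrt(
            sum(wi ** 2 * (di - delta) ** 2 for wi, di in zip(w, d)) / sum(w))
        converged = abs(new_sigma - sigma) < tol
        sigma = new_sigma
        if converged:
            break
    # Recompute the weights and centre with the final sigma.
    w = [1.0 / (sigma ** 2 + si ** 2) for si in s]
    delta = sum(wi * di for wi, di in zip(w, d)) / sum(w)
    central = math.exp(delta)
    central_se = central / math.sqrt(sum(w))
    return central, central_se, sigma


# Invented single-aliquot De values (Gy) with 5% measurement errors.
des = [430.0, 470.0, 510.0, 455.0, 540.0, 390.0, 480.0, 500.0]
errs = [0.05 * v for v in des]
central, se, od = central_age_model(des, errs)
print(f"central De = {central:.0f} +/- {se:.0f} Gy, overdispersion = {od:.0%}")
```

Because the overdispersion enters the weights, the standard error on the central value widens when the aliquots scatter more than their individual errors can explain.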

As a further test of the reliability of our Dₑ estimates for the TUT samples, we
have plotted the estimates of the central age model as a function of stimulation
temperature in Extended Data Fig. 9a. These plots show that the Dₑ values increase with
stimulation temperature until a ‘plateau’ is reached at higher temperatures for each 
of the TUT samples; the plateau region (marked by the dashed line) indicates that 
a non-fading component is present at these elevated temperatures. The existence 
of a plateau can be used, therefore, as an internal, diagnostic tool to confirm that 
a stable, non-fading component has been isolated for age determination. For all 
four TUT samples, a plateau is reached at temperatures of 200°C and above, from 
which we infer negligible fading of the MET-pIRIR 250°C signal. We calculated 
the sample ages, therefore, using the Dₑ values obtained from the 250°C signal. The
corresponding weighted mean Dₑ values, dose rate data and final ages are listed
in Supplementary Table 4. 

For sample TLT-OSL6 from the TLT, four of the eight aliquots measured emitted
natural MET-pIRIR 250°C signals consistent with the saturation levels of the corresponding
dose-response curves (for example, Extended Data Fig. 9b). This implies
that the IRSL traps were saturated in the natural sample, which further supports
our conclusion that the MET-pIRIR 250°C signal had a negligible fading rate. It
would be hazardous to estimate the age of sample TLT-OSL6 from the Dₑ values
of the four non-saturated aliquots, as these may represent only the low Dₑ values
in the ‘tail’ of a truncated distribution. If we adopt the average 2.3D₀ value for the
MET-pIRIR 250°C signal of all five Talepu samples (~820 Gy) as an upper limit
for reliable Dₑ estimation, then this corresponds to a minimum age of ~195 kyr
for sample TLT-OSL6 (Supplementary Table 4).
K-feldspar chronology. The MET-pIRIR 250°C ages for the four samples dated 
from the TUT (=T2) are in correct stratigraphic order, increasing from 103 ± 9 kyr
(at ~3 m depth) to 156 ± 19 kyr (at ~10 m depth). They thus span the period from
marine isotope stage 6 (the penultimate glacial) to marine isotope stage 5 (the
last interglacial). This coherent sequence of ages also supports our contention that
the Talepu samples were sufficiently bleached before deposition. 

The sample analysed from ~8 m depth in the TLT (=T4; sample TLT-OSL6) 
yielded a minimum age of ~195 kyr, corresponding to marine isotope stage 7 (the 
penultimate interglacial) or earlier. We have not yet dated the other sediments 
exposed in the TLT, but expect that the 6 m of deposit immediately overlying TLT-OSL6
will be older than 156 ± 19 kyr, as they stratigraphically underlie sample
TUT-OSL9 in the TUT. 

We interpret the ages for the TUT samples as true (finite) depositional ages,
based on the existence of Dₑ plateaux (Extended Data Fig. 9a) and the increase in
Dₑ with depth (that is, ordered stratigraphically). This is the most parsimonious
reading of our data. The measured fading rate of 0.17 ± 1.13% per decade for sample
TUT-OSL3 allows for the possibility, however, that the MET-pIRIR 250°C signal
may still fade slightly and that our samples had reached an equilibrium state of
trap filling and emptying (so-called field saturation). If so, then the increase in
Dₑ with depth could, instead, be due to a systematic decline in fading rate with
increasing depth. Any such trend cannot be verified or rejected from laboratory
measurements of the g value, owing to the size of the associated uncertainties at
low fading rates (Extended Data Fig. 7c, d). The ages for the TUT samples could,
therefore, be viewed conservatively as minimum ages (as for sample TLT-OSL6),
given the uncertainties in the measured fading rate of the 250°C signal and the
exact level at which the signal saturates. The measured age of the uppermost sample
in the sequence, TUT-OSL1, would increase by about 15% and 40% after correcting
for assumed fading rates of 0.5 and 1% per decade, respectively. Similarly,
the measured ages of TUT-OSL2, -OSL3 and -OSL9 would increase by about 17, 23 and
28%, respectively, after correcting for an assumed fading rate of 0.5% per decade.
Thus, whether viewed as true ages or as minimum ages, the TUT sediments were
deposited more than ~100 kyr ago.

Palaeomagnetic dating. Samples for palaeomagnetic polarity assessment were 
taken from the baulks of excavations Talepu 2 (T2) and Talepu 4 (T4) (Fig. 2). 
Samples were taken at 20-30 cm intervals using non-magnetic tools, preferentially
from non-bioturbated silty deposits. The upper conglomeratic
interval of T2 was omitted because of its coarser grain size and because it appeared
heavily affected by soil formation and plant root bioturbation. From each sample
level, five oriented specimens were retrieved by carving the sediment using
non-magnetic tools and fitting them into 8 cm³ plastic cubes. The samples were
labelled according to excavation, baulk and depth. 

In the laboratory, all specimens were demagnetized using an alternating-field
demagnetizer. The mean magnetic directions for each sample are presented in
Supplementary Table 7. Demagnetization was performed with intervals of 
2.5-5 mT to a peak of up to 80-1,000 mT. The magnetization vectors obtained 
from most samples showed no more than two separated components of natural 
remanent magnetization (NRM) on the orthogonal planes, which means that the 
specimens had been affected by secondary magnetization. However, secondary 
magnetization was easily removed with a demagnetization of up to 5-20 mT, while 
the characteristic remanent magnetizations (ChRMs) could be isolated through 
stepwise demagnetization of up 20-40 mT, in some cases up to 50 mT. Above 
40 mT most samples were completely demagnetized (Extended data Fig. 4j and 
Supplementary Table 8). 

The mean magnetization intensities and palaeomagnetic directions are plotted 
against stratigraphic depth in Extended Data Fig. 5. The 90-98% intensity saturation
was achieved from 1.30 × 10~* to 3.81 × 10-7 A m⁻¹ before demagnetization,
and between 8.52 × 10° and 1.49 × 104 A m⁻¹ after demagnetization at
20-40 mT. The direction of ChRMs was determined from the orthogonal plots in
at least four or five successive measurement steps between 20 and 50 mT using
principal component analysis (PuffinPlot and IAPD 2000 software) with
the maximum angular deviation set at <5°. Although there are no well-defined
criteria for the acceptability of palaeomagnetic data, the k > 30
and α95 < 15° criteria of ref. 60 were used to accept the average remanence
direction for sampled levels. On the basis of these tests, all the samples (n = 24)
throughout the Talepu sequences yielded acceptable ChRM directions and
showed a normal polarity. The ChRM directions were relatively constant throughout
the sequences, except for the directions of samples taken in T2 at 6.5 and 7.5 m
depth, which showed steep inclinations of 56-68°. Such steep inclinations are
unusual for near-equatorial regions. One possible interpretation is that post-depositional
mass-movement disturbances, such as creep or a landslide, resulted in
rotational movements of this interval.
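
The acceptance test described above (k > 30 and α95 < 15°) can be illustrated with a minimal sketch of Fisher statistics for the specimen directions of one sample level. The specimen directions below are hypothetical, not data from this study, and the function names are illustrative only. The geocentric axial dipole relation tan I = 2 tan(latitude) is added as a standard textbook relation, not one stated in the text, to show why steep inclinations are unexpected near the equator.

```python
import math

def fisher_mean(directions):
    """Fisher mean of (declination, inclination) pairs given in degrees.

    Returns (mean_dec, mean_inc, k, alpha95), with angles in degrees.
    """
    n = len(directions)
    if n < 2:
        raise ValueError("need at least two directions")
    x = y = z = 0.0
    for dec, inc in directions:
        d, i = math.radians(dec), math.radians(inc)
        x += math.cos(i) * math.cos(d)  # north component
        y += math.cos(i) * math.sin(d)  # east component
        z += math.sin(i)                # down component
    r = math.sqrt(x * x + y * y + z * z)  # resultant vector length
    if r == 0.0:
        raise ValueError("directions cancel out; mean is undefined")
    mean_dec = math.degrees(math.atan2(y, x)) % 360.0
    mean_inc = math.degrees(math.asin(max(-1.0, min(1.0, z / r))))
    if n - r < 1e-12:  # all directions identical
        return mean_dec, mean_inc, math.inf, 0.0
    k = (n - 1) / (n - r)  # precision parameter
    cos_a95 = 1.0 - ((n - r) / r) * (20.0 ** (1.0 / (n - 1)) - 1.0)
    alpha95 = math.degrees(math.acos(max(-1.0, min(1.0, cos_a95))))
    return mean_dec, mean_inc, k, alpha95

def passes_criteria(k, alpha95, k_min=30.0, alpha95_max=15.0):
    """Acceptance thresholds quoted in the text (k > 30, alpha95 < 15 degrees)."""
    return k > k_min and alpha95 < alpha95_max

def gad_inclination(latitude_deg):
    """Inclination expected from a geocentric axial dipole: tan I = 2 tan(lat)."""
    return math.degrees(math.atan(2.0 * math.tan(math.radians(latitude_deg))))

# Hypothetical specimen directions (declination, inclination) for one level.
specimens = [(2.0, -8.0), (5.0, -10.0), (358.0, -6.0), (1.0, -9.0), (4.0, -7.0)]
mean_dec, mean_inc, k, a95 = fisher_mean(specimens)
print(mean_dec, mean_inc, k, a95, passes_criteria(k, a95))
```

Under the dipole relation, an inclination of 56-68° would correspond to a mid-latitude site, which is why a post-depositional rotation is a reasonable candidate explanation for the two deviating levels.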

The equal-area projections show that the within-site means of
the remanence directions group more closely together after demagnetization,
and that no significant change in the major remanence direction occurs with depth.
The major remanence direction corresponds closely with the present magnetization
direction (Extended Data Fig. 5b).

⁴⁰Ar/³⁹Ar fusion dating of single sanidine crystals. Sample TAL-10-01 was taken
from T4, sub-unit E2, at a depth of 2.5 m below the surface. Euhedral sanidine
crystals up to 250 µm in length were hand-picked following standard heavy liquid
and magnetic separation techniques. Crystals were loaded into wells in aluminium
sample discs (diameter 18 mm) for neutron irradiation, along with the 1.185 Myr
Alder Creek sanidine⁶¹ as the neutron fluence monitor. Neutron irradiation was
done in the cadmium-shielded CLICIT facility at the Oregon State University
TRIGA reactor. Argon isotopic analyses of gas released by CO₂ laser fusion of
single sanidine crystals (Supplementary Table 9) were made on a fully automated,
high-resolution, Nu Instruments Noblesse multi-collector noble-gas mass spectrometer,
using procedures documented previously. Sample gas clean-up was
through an all-metal extraction line, equipped with a −130 °C cold trap (to remove
H₂O) and two water-cooled SAES GP-50 getter pumps (to absorb reactive gases).
Argon isotopic analyses of unknowns, blanks and monitor minerals were performed
in identical fashion. ⁴⁰Ar and ³⁹Ar were measured on the high-mass ion
counter, ³⁸Ar and ³⁷Ar on the axial ion counter and ³⁶Ar on the low-mass ion
counter, with baselines measured no less than every third cycle. Measurement
of the ⁴⁰Ar, ³⁸Ar and ³⁶Ar ion beams was performed simultaneously, followed by
sequential measurement of ³⁹Ar and ³⁷Ar. Beam switching was achieved by varying
the field of the mass spectrometer magnet and with minor adjustment of the quad
lenses. Data acquisition and reduction were performed using the program ‘Mass
Spec’ (A. Deino, Berkeley Geochronology Center). Detector intercalibration and
mass fractionation corrections were made using the weighted mean of a time series
of measured atmospheric argon aliquots delivered from a calibrated air pipette.
Decay and other constants, including correction factors for interference isotopes
produced by nucleogenic reactions, are as reported in ref. 62.
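
As a rough illustration of how a fluence monitor of known age is used, the sketch below applies the standard ⁴⁰Ar/³⁹Ar age equation, t = (1/λ) ln(1 + J·R), where R is the radiogenic ⁴⁰Ar to K-derived ³⁹Ar ratio and J is obtained from the monitor. The decay constant used here is an assumption (a commonly quoted total ⁴⁰K value); the study itself uses the constants of ref. 62, which may differ. Both isotope ratios are hypothetical placeholders, and interference and mass-fractionation corrections are omitted.

```python
import math

# Assumed total 40K decay constant (per year); NOT taken from the source,
# which refers to ref. 62 for the constants actually used.
LAMBDA_TOTAL = 5.543e-10

def j_factor(monitor_age_yr, monitor_ratio):
    """Irradiation parameter J from a monitor of known age and measured 40Ar*/39ArK."""
    return math.expm1(LAMBDA_TOTAL * monitor_age_yr) / monitor_ratio

def ar_age(j, ratio):
    """Age in years from J and the 40Ar*/39ArK ratio of an unknown."""
    return math.log1p(j * ratio) / LAMBDA_TOTAL

monitor_age = 1.185e6   # Alder Creek sanidine age quoted in the text (years)
monitor_ratio = 0.50    # hypothetical measured ratio for the monitor
j = j_factor(monitor_age, monitor_ratio)

crystal_ratio = 4.0     # hypothetical ratio for one sanidine crystal
crystal_age = ar_age(j, crystal_ratio)
print(f"J = {j:.6e}, crystal age = {crystal_age / 1e6:.2f} Myr")
```

Because the unknowns and the monitor share the same irradiation, the age of each crystal scales with its ratio relative to the monitor, which is why unknowns, blanks and monitors are analysed in identical fashion.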

The resulting age probability diagram for single sanidine crystals (Extended 
Data Figure 10) shows a wide range in ages with a dominant population around 
9.4 million years ago (Late Miocene). This indicates that the sanidine crystals from 
the sample do not represent a single volcanic event, but were predominantly derived 
from erosion of the Miocene volcanic rocks west of the Walanae Depression and/ 
or from Late Miocene marine sediments of the Walanae Formation.

Extended Data Figure 1 | Research location and geology of southwest
Sulawesi and Talepu. a, Digital elevation model of southwest Sulawesi.
1: Talepu; 2: Maros karst area. Area enclosed by rectangle shown in d;
map data: copyright USGS/NASA SRTM (2007). b, Geological map of 
southwest Sulawesi (data from refs 11, 63, 64). The Walanae Depression 
(WD) is an elongate fault-bounded basin (also known as the West 
Sengkang Basin) separated from the Bone Mountains to the east by a 
major fault, the East Walanae fault (EWF), which formed in response to 
east-west compression and strike-slip movements along the Walanae fault 
zone. To the west the basin is bordered by the Western Dividing Range, 
consisting of uplifted Miocene volcanics deposited in a shallow marine 
environment¹³. The Walanae Depression basin infill consists of a several-kilometre-thick
regressive sequence, named the Walanae Formation. In
the southern part of the Walanae Depression, the Walanae Formation is 
folded and deformed by Pleistocene compression, whereas to the north 
near Lake Tempe deposition continues to the present day. 1: Talepu; 2:
Paroto (alluvial terrace of the Walanae River); 3: Beru; 4: Tanrung
River (palaeontological site: coastal terrace deposits); 5: Sompe;
6: Celeko; 7: Maros karst archaeological sites. c, Schematic
stratigraphic scheme for the northern Sengkang Basin at the latitude of
the Talepu site (green dotted line in b). The Walanae Formation basin
fill represents a regressive sequence that was strongly influenced by
tectonic movements along the Walanae fault zone. The youngest unit of
the Walanae Formation is the Beru member (deltaic sands, clays and
gravels), which contains fossil vertebrate remains of the Walanae Fauna¹¹.
The lower part of the Beru member (unit A) is characterized by
sedimentary structures indicative of shallow marine/estuarine/fluvial 
depositional environments. The upper part of the Beru member (unit B) 
consists of fully terrestrial fluvio-lacustrine deposits, which merge into the 
modern floodplain along the depocentral axis of the Walanae Depression. 
The coarser-grained unit B of the Beru member was not deposited in the 
Sengkang anticline, which started to rise during the Middle Pleistocene, or 
in the southern portion of the Walanae Depression south of Talepu. East 
of the Walanae fault zone, in the East Sengkang Basin, uplift and folding
during the Pliocene caused a depositional hiatus. Here Late Miocene
deformed marine deposits of the Walanae Formation are unconformably
overlain by a conglomerate up to 5 m thick, the Tanrung Formation, which
contains a distinct fossil vertebrate fauna¹¹. During the Middle and Late
Pleistocene, uplift of the Western Dividing Range generated the formation
of alluvial fans and influxes of coarse-grained boulder conglomerates
into the Walanae Depression. d, Geological map of Talepu area with sub-horizontal
layering, fault-bounded to the east by steeply west-dipping
strata of the Sengkang anticline. 1: Modern alluvium; 2: Late Pleistocene 
alluvial terrace; 3-7: lithological sub-units of the Walanae Formation: 3: 
fluvio-lacustrine facies of the upper part of the Beru member; 4: fluvio- 
estuarine facies of the lower part of the Beru member; 5: shallow marine 
facies of the Samaoling member; 6: deep marine facies of the Burecing 
member; 7: coral reef facies of the Tacipi member; 8: strike and dip; 9: sub- 
horizontal layering; 10: major fault; 11: sites with surface-collected stone 
artefacts; 12: sites with in situ stone artefacts; 13: fossil vertebrate localities.

the east of the north-south trending ridge. Talepu is located behind the
palm trees on the left. b, View towards the East Baulk of Talepu excavation
2 in 2009. In 2010 the 1 m × 2 m excavation was extended to a 2 m × 2 m
excavation. c, The 12 m deep Talepu excavation 2 in 2012. d, Talepu
excavation 4 in 2010. View towards the North Baulk. e, Talepu excavation
2 in 2012. Photograph shows the North Baulk at 4-5 m depth, with, in the
upper part, the base of gravel unit A7 and the three holes left by sampling
for optical dating.

Extended Data Figure 3 | Composition of gravels and distribution
of stone artefacts in T2, and size distribution of recent anoa molars.
a, Gravel compositions based on pebble counts (200 pebbles per level).
Overall, the composition of the gravel is dominated by volcanic pebbles, 
which become more abundant with increasing depth, probably as a result 
of less intense weathering further down (near the surface, the volcanic 
clasts are frequently weathered to a crumbly clayey ‘ghost’). Note the 
increase both in weathering-resistant silicified rock pebbles and in heavily 
weathered indeterminable clasts towards the top of the sequence. b, Total 
number of artefacts per 10 cm spit (black triangles), total amount of gravel 
clasts per spit (red graph) and the maximum clast diameter per spit (blue 
graph: values represent the mean maximum clast diameter of the ten 
largest clasts). Note the higher concentration of both gravel and artefacts 
in the topsoil: pebbles and artefacts are concentrated by winnowing of
sand and clay by sheetwash processes. c, Detail of the topsoil as exposed in
the west baulk of excavation T2. d, Detail of the basal sequence as exposed
in the north baulk of excavation T2. Note the cross-bedded foresets of the
pebbly sand of sub-unit D2, with inter-bedded mud laminae, indicative
of tidal activity. Diameter of the round sample hole (for optical dating)
is ~10 cm. e, Histogram of the transverse diameter measurements (in
millimetres) of recent lowland anoa (Bubalus [Anoa] depressicornis) lower
molars measured in the collections of the Naturalis Biodiversity Centre,
Leiden, the Netherlands (n = 32). The lowland anoa is the largest living
anoa, bigger in body size than the mountain anoa, Bubalus [Anoa] quarlesi.
The lower molar fragment from unit A of the Talepu-2 excavation (Fig. 3t)
has a preserved basal transverse diameter of 14.4 mm and an estimated
basal transverse diameter of 15.5 mm, slightly above the size range for
extant lowland anoa.

Extended Data Figure 4 | Fossil samples used for uranium-series dating
and demagnetization results of representative palaeomagnetic samples.
Faunal remains from Talepu excavation 4, used for uranium-series laser
ablation dating (a-i) and representative NRM intensity plots of progressive
demagnetization (j). a-i, Close-ups of the surface sections for each fossil
with the laser spot profiles. All fossils originate from excavation T4,
sub-unit E2. Scale bars next to fossils are 2 cm; white scale bars in
close-ups are 1 mm. Numbers between brackets are Australian National
University laboratory numbers. a, Specimen TLP10-F8 (ANU-2946),
Celebochoerus upper left first incisor; laser ablation transect on cut
section of root. b, c, Two sections measured on different transects of the
same specimen, TLP10-F1 (ANU-2947 and ANU-2948), a Celebochoerus
lower left canine. d, Specimen TLP10-F7 (ANU-2951), rib fragment
of Celebochoerus. e, Specimen TLP10-F4 (ANU-2954), Celebochoerus
upper left third molar; laser ablation transect on cut section of root.
f, Specimen TLP10-F3 (ANU-2956), Celebochoerus upper right third
molar (same individual as previous); laser ablation transect on cut section 
of root. g, Specimen TLP10-F9 (ANU-2955), Celebochoerus upper molar 
fragment; laser ablation transect on cut section of enamel and dentine. 
h, Specimen TLP10-F6 (ANU-2949), bone fragment. i, TLP10-F2 
(ANU-2942), Celebochoerus upper left fourth premolar; laser ablation 
transect on cut section of root. j, NRM intensity plot of progressive 
demagnetization (upper left), equal area projections (middle left) and 
vector end-point demagnetization orthogonal plots (bottom left) for two 
Talepu palaeomagnetic samples (T2-510-1, T4-180-4). To the right the 
demagnetization curve for an additional sample (T2-320-4) is given. The 
inset shows the zoomed-out trajectory endpoints of sample T2-320-4. 
Open squares on the equal area projection diagrams indicate an upper 
hemisphere magnetic direction.

Extended Data Figure 5 | Lithological and magnetic properties against
depth for the composite stratigraphic column at Talepu. a, Columns
from left to right show lithology, sand/silt/clay ratios, NRM magnetic
intensities before and after demagnetization, and magnetic declination and
inclination directions. The intensities before and after demagnetization
represent averages for each sampled level. Declination and inclination …
b, Equal-area projections of NRM and ChRM directions for all sampled levels,
and the mean direction (circles with crosses; the mean is of all sampled levels
except the two levels with deviating inclinations: n = 22) and present-day
magnetic direction in the area (red crosses).

response curves for the IRSL (50 °C) and MET-pIRIR (100-250 °C) signals
for the same aliquot. The natural signals are shown on the x axis using the …
The grey band shows the mean of the Dp values.

Extended Data Figure 7 | Results from residual dose, dose recovery and
anomalous fading tests. a, Residual doses measured for bleached aliquots
of the four samples from the upper trench, plotted against stimulation
temperature. Each data point represents the mean and standard error for
four aliquots. b, Results of the dose recovery test conducted on sample
TUT-OSL1. The measured/given dose ratios are shown for the IRSL and
MET-pIRIR signals at the different stimulation temperatures. Each data
point represents the mean and standard error for four aliquots. The data
shown in red squares were obtained using a hot IR bleach at the end of
each SAR cycle, as per the conventional MET-pIRIR procedure. The
data shown in black circles were obtained with the modified MET-pIRIR
procedure (Supplementary Table 4), using a solar simulator bleach instead
of a hot IR bleach. The dashed line denotes a ratio of unity, and the solid
lines indicate ratios 10% larger and smaller than unity. The data (circles)
obtained using the modified MET-pIRIR procedure fall within the latter
band. c, Decay of the sensitivity-corrected IRSL and MET-pIRIR signals
with their standard errors of six aliquots from TUT-OSL3, plotted against
log(t/t_c), where t is the delay period for each measurement and t_c is the
time of the first measurement (t_c = 720, 870, 1,040, 1,240 and 1,480 s for
the signals measured at 50, 100, 150, 200 and 250 °C, respectively). The
sensitivity-corrected signals were normalized to the first measurements.
d, Anomalous fading rates (g values) and their standard errors for the
IRSL and MET-pIRIR signals of TUT-OSL3 obtained using the data sets
in c, plotted against stimulation temperature. All the g values have been
normalized to a delay time of 2 days.
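
The way a g value is read off a decay plot such as panel c can be sketched as follows. The sketch assumes the usual definition, in which the normalized signal falls linearly with log10(t/t_c) and g is the percentage loss per decade of time; the renormalization to a 2-day delay follows directly from that same relation. The data points are synthetic, generated with a known g of 2% per decade, and the function names are illustrative rather than those of any software used in the study.

```python
import math

def g_value(delay_times_s, signals, t_c):
    """Fading rate (% per decade) from a straight-line fit of signal vs log10(t/t_c)."""
    xs = [math.log10(t / t_c) for t in delay_times_s]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # fitted signal at t = t_c
    return -100.0 * slope / intercept

def normalize_g(g, t_c, t_ref=2 * 24 * 3600.0):
    """Re-express a g value measured relative to t_c at a reference delay (default 2 days)."""
    return g / (1.0 - (g / 100.0) * math.log10(t_ref / t_c))

# Synthetic decay data: 2% signal loss per decade, first measurement at 720 s.
delays = [720.0, 7200.0, 72000.0]
signals = [1.0 - 0.02 * math.log10(t / 720.0) for t in delays]

g = g_value(delays, signals, 720.0)
g_2days = normalize_g(g, 720.0)
print(g, g_2days)
```

With real measurements, the scatter of the points sets the standard error of the slope, which is why very low fading rates are hard to distinguish from zero.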


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


[Radial plots, panels a-d; axes: relative error (%) and precision. a, TUT-OSL1: D_e = 536 ± 42 Gy, OD = 19 ± 6%. b, TUT-OSL2: D_e = 587 ± 45 Gy, OD = 21 ± 6%. c, TUT-OSL3: D_e = 682 ± 48 Gy, OD = 22 ± 5%. d, TUT-OSL9: D_e = 760 ± 83 Gy, OD = 38 ± 8%.]

Extended Data Figure 8 | Radial plots of single-aliquot D_e values for the
TUT samples. a, TUT-OSL1. b, TUT-OSL2. c, TUT-OSL3. d, TUT-OSL9.
The grey band in each plot shows the weighted mean of the measured
D_e values estimated using the central age model. The D_e estimate and the
overdispersion (OD) value for each D_e distribution based on the central
age model are also shown in each plot.


[Panel a: D_e (Gy) against IR stimulation temperature (°C) for TUT-OSL1, TUT-OSL2, TUT-OSL3 and TUT-OSL9. Panel b: sensitivity-corrected IRSL signal against dose (Gy), with regenerative and natural data points.]


Extended Data Figure 9 | D_e versus temperature plots for the TUT
samples and the dose-response curve for sample TLT-OSL6. a, Plots of
the weighted mean D_e against stimulation temperature for the TUT samples.
The dashed line in each plot shows the plateau range of D_e values. Each data
point represents the mean and standard error for 8 (TUT-OSL1), 10 (TUT-OSL2),
12 (TUT-OSL3) and 13 (TUT-OSL9) aliquots. b, Dose-response
curve for the sensitivity-corrected MET-pIRIR 250 °C signal from an aliquot
of sample TLT-OSL6. The regenerative-dose data points and their standard
errors were fitted using a single saturating-exponential function, and the
best-fit curve is shown as a full line. The natural signal of this aliquot (red
circle on the y axis) falls in the saturated region of the curve (see dashed
line), so only a minimum D_e can be estimated.

Diet-induced extinctions in the gut microbiota
compound over generations


Erica D. Sonnenburg, Samuel A. Smits, Mikhail Tikhonov, Steven K. Higginbottom, Ned S. Wingreen &
Justin L. Sonnenburg


The gut is home to trillions of microorganisms that have 
fundamental roles in many aspects of human biology, including 
immune function and metabolism¹,². The reduced diversity of
the gut microbiota in Western populations compared to that in 
populations living traditional lifestyles presents the question of 
which factors have driven microbiota change during modernization. 
Microbiota-accessible carbohydrates (MACs) found in dietary fibre 
have a crucial involvement in shaping this microbial ecosystem, 
and are notably reduced in the Western diet (high in fat and simple 
carbohydrates, low in fibre) compared with a more traditional 
diet. Here we show that changes in the microbiota of mice
consuming a low-MAC diet and harbouring a human microbiota 
are largely reversible within a single generation. However, over 
several generations, a low-MAC diet results in a progressive loss of 
diversity, which is not recoverable after the reintroduction of dietary 
MACs. To restore the microbiota to its original state requires the 
administration of missing taxa in combination with dietary MAC 
consumption. Our data illustrate that taxa driven to low abundance 
when dietary MACs are scarce are inefficiently transferred to the 
next generation, and are at increased risk of becoming extinct 
within an isolated population. As more diseases are linked to the 
Western microbiota and the microbiota is targeted therapeutically, 
microbiota reprogramming may need to involve strategies that 
incorporate dietary MACs as well as taxa not currently present in 
the Western gut. 

The gut microbiota of hunter-gatherers and populations consum- 
ing a rural agrarian diet is distinct, and contains greater diversity than 
the microbiota of Westerners (Extended Data Fig. 1). One possible
explanation for the greater microbiota diversity seen in hunter-gatherers
and agrarians is the large quantity of dietary fibre they consume
relative to Westerners. MACs, which are abundant in
dietary fibre, serve as the primary source of carbon and energy for the
distal gut microbiota. Therefore, we sought to determine whether
a diet low in MACs could drive loss of taxa within the gut microbiota. 

Humanized mice (4 weeks old, n = 10) were fed a diet rich in fibre 
derived from a variety of plants (high-MAC) for 6 weeks, and ran- 
domly divided into two groups (Extended Data Fig. 2). One group was 
switched to a low-MAC diet for 7 weeks, after which they were returned 
to the high-MAC diet for 6 weeks (Fig. 1a and Extended Data Table 1).
The control group was maintained on the high-MAC diet throughout 
the experiment. At the start of the experiment, the microbiota com- 
position from both groups of mice was indistinguishable (P = 0.2, 
Student’s t-test; UniFrac distance; no significant difference in oper- 
ational taxonomic unit (OTU) frequency observed between groups, 
Mann-Whitney U test). The diet-switching mice, while consuming 
the low-MAC diet, had an altered composition relative to controls 
(P=10~?°, Student's t-test; UniFrac distance). Weeks after returning to 
the high-MAC diet, the microbiota of the diet-switching mice remained
distinct from controls (P = 3 × 10⁻⁸, Student’s t-test; UniFrac distance
at 15 weeks) (Fig. 1b). To determine whether taxa had been lost over 
the course of the diet perturbation, we focused on a subset of OTUs 
that met stringent measures of prevalence and abundance and could 
be confidently monitored over the course of the experiment (‘high- 
confidence’ OTUs, see Methods). We identified 208 high-confidence 
OTUs in the diet-switching group and 213 high-confidence OTUs in 
the control group (Extended Data Table 2). When mice were switched 
from the high-MAC diet to the low-MAC diet, we observed that 60% of 
taxa (124 out of 208) decreased in abundance at least fourfold compared 
with only 11% of the control group (25 out of 213) (Supplementary 
Table 1). When these mice were returned to a high-MAC diet, 33% 
(71 out of 208) were fourfold less abundant. The control group did not 
change significantly (10% were fourfold less abundant; 22 out of 213) 
(Fig. 1c and Supplementary Table 2). These data reveal two divergent 
qualities of the microbiota. First, 59 of the 208 high-confidence OTUs
that exhibited a diet-induced decline in abundance recovered (were no
longer at least twofold less abundant) with the reintroduction of MACs,
illustrating microbiota resilience over short time scales (Supplementary
Table 1). Second, however, the low-MAC-diet perturbation induced
‘scars’ on the microbiota.
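
The fold-change tallies reported above can be illustrated with a short sketch that counts how many OTUs dropped at least fourfold between two time points. The OTU names and read counts are hypothetical, and the pseudocount used to avoid division by zero is an assumption; the Methods of the paper define the actual procedure.

```python
def fraction_decreased(baseline, later, fold=4.0, pseudocount=1.0):
    """Count OTUs whose abundance fell by at least `fold` between two time points.

    `baseline` and `later` map OTU name -> read count. The pseudocount is an
    assumed convention for handling OTUs that drop to zero reads.
    """
    decreased = [
        otu for otu in baseline
        if (baseline[otu] + pseudocount) / (later.get(otu, 0.0) + pseudocount) >= fold
    ]
    return len(decreased), len(decreased) / len(baseline)

# Hypothetical read counts for four OTUs at baseline and at week 9.
baseline = {"otu_a": 400, "otu_b": 120, "otu_c": 35, "otu_d": 60}
week9 = {"otu_a": 80, "otu_b": 110, "otu_c": 0, "otu_d": 20}

count, fraction = fraction_decreased(baseline, week9)
print(count, fraction)

# The proportion quoted in the text for the diet-switching group on the low-MAC diet.
print(round(100 * 124 / 208))
```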

We proposed that diet-induced microbiota diversity loss would be 
magnified over generations. Mice from the previous experiment consuming
the low-MAC diet or the high-MAC diet were used to generate
a litter of pups. Pups were weaned onto the respective diets of their
parents. This breeding strategy was repeated for four generations. For
each generation, low-MAC-diet parents were switched to the high-MAC
diet after their pups were weaned, to see whether taxa that
became undetectable while MACs were scarce would bloom in the pres- 
ence of MACs (Fig. 2a and Extended Data Fig. 3). At 5 weeks old, mice 
propagated in the low-MAC diet condition (born to low-MAC-diet 
parents and consuming a low-MAC diet) had a lower-diversity micro- 
biota than high-MAC-fed controls (P=3 x 10-°, P=8 x 10-° and 
P=8 x 107°, Student’s t-test, Shannon index; generations two, three 
and four, respectively) (Fig. 2b, top). Even after mice were switched to 
the high-MAC diet for several weeks, their microbiota diversity did not 
recover to control levels (P=2 x 107°, P=1 x 107° and P=1 x 107+, 
Student’s t-test; generations two, three and four, respectively, at week 15) 
(Fig. 2b, bottom). With each generation, the microbiota composition 
of the diet-switching group showed increasing departure from that of 
controls (Fig. 2c). Weaning the diet-switching lineage directly onto 
the high-MAC diet did not correct the diversity loss relative to con- 
trols (P=3 x 10~° Student's t-test, Shannon index), and there was no 
difference in composition between this group and the generation-four 
diet-switching group (P=0.9, Student's t-test; UniFrac distance; week 
13, generation four, and week 5, generation five) (Extended Data Fig. 4). 
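
The diversity comparisons above rest on the Shannon index, which can be sketched in a few lines. The example assumes natural logarithms and raw OTU read counts as input; the text does not state the logarithm base or any rarefaction step, so both are assumptions, and the counts are hypothetical.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical OTU read counts: an even community versus one dominated by a single OTU.
even_community = [25, 25, 25, 25]
skewed_community = [97, 1, 1, 1]

print(shannon_index(even_community), shannon_index(skewed_community))
```

A community that loses taxa, or in which a few taxa come to dominate, scores lower on this index, which is the pattern described for the low-MAC lineage.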

Plotting the relative abundance of the high-confidence OTUs 
over time revealed a pattern of taxa loss over generations in the

Figure 1 | Taxa reduction observed in low-MAC diet is largely reversible
in a single generation. a, Schematic of mouse experiment. Humanized
mice (n = 10) were maintained on a high-MAC diet for 4 weeks, after
which one-half of the mice were switched to a low-MAC diet for 7 weeks.
These mice were then switched back to the high-MAC diet for 6 weeks.
b, Principal coordinate (PC) analysis of the UniFrac distance for 16S
ribosomal RNA amplicon profiles from faecal samples collected from
the diet-switching mice (yellow, n = 5) and control high-MAC-diet mice
(green, n = 5). c, Distribution of OTU fold changes for diet-switching
(blue, n = 5) or control (red, n = 5) groups comparing baseline (4 weeks
post-humanization) versus week 9 (5 weeks post-low-MAC diet for ‘diet
switch’ group; top panel) and baseline versus week 15 (4 weeks after return
to high-MAC diet for ‘diet switch’ group, bottom panel).


diet-switching group (Fig. 2d). Specifically, in the diet-switching 
group, generations one, two and three exhibited a progressive loss 
in high-confidence OTUs while consuming the low-MAC diet (72%; 
150 out of 208 lost by generation four, week 15) (Supplementary 
Table 3). In each generation, switching to the high-MAC diet allowed 
for the recovery of a small number of taxa (grey versus yellow high- 
lighted rows within each generation in Fig. 2d), but most did not 
return (141 out of 208 were undetectable by generation four, week 
15) (Supplementary Table 3). Most of the lost taxa (112 out of the 141) 
were from the Bacteroidales order with an additional loss of 26 taxa 
from the Clostridiales, making the Clostridiales the most numerous 
high confidence taxon present in the fourth generation (Extended 
Data Fig. 5a, b). 




To determine whether the carbohydrate degrading capacity of the 
microbiota had been altered over the four generations, we compared 
imputed glycoside hydrolases between the first and fourth generations 
of both the diet-switching and control groups after validating this 
method (Supplementary Table 4). Although representation of glycoside
hydrolase families is not a perfect correlate of specificity for polysaccharide
degradation (for example, owing to combinatorial activity or
polyspecificity within a family), loss of representation within glycoside 
hydrolase families provides one measure of changes in glycan-degrading 
capacity. Twenty-two glycoside hydrolase families showed a loss in 
abundance in the diet-switching group between the initial time point 
in generation one versus generation four, 4 weeks after the switch to 
the high-MAC diet (P < 0.05 plus at least a twofold change, Bonferroni- 
corrected Student's t-test) (Extended Data Fig. 5c and Supplementary 
Table 5). No differences in glycoside hydrolase families were observed 
in the control group. An overall loss in glycoside hydrolase diversity 
occurred between generation-four diet-switching mice on the high- 
MAC diet relative to generation-one mice (P = 0.0002, Shannon 
index of glycoside hydrolase subfamilies; Supplementary Table 6 and 
Extended Data Table 3). No difference in glycoside hydrolase subfam- 
ilies was observed in the control group. These data demonstrate that, 
in addition to the loss of high-confidence OTUs, the diet-switching 
group sustained a widespread and marked loss in glycoside hydrolase 
repertoire over the four generations. Future experiments will be needed 
to reveal the functional consequences of these observations in terms of 
fibre-degrading capacity. 

We next wanted to determine whether low-abundance OTUs that 
bloom when MACs are reintroduced are more likely to be lost owing 
to inefficient inter-generational transmission. Low-abundance taxa 
(average abundance <25 reads) in a given generation were less efficiently
transferred to the next generation (average abundance >10
reads to be considered present, 4 weeks after high-MAC diet) compared
to high-abundance taxa (average abundance >25 reads) (P=0.002, 
P=4x 10~ and P=0.01, hypergeometric distribution inheritance 
between generations one and two, two and three, and three and four, 
respectively) (Supplementary Table 7). Notably, overall diversity 
and composition did not change between the third and fourth gen- 
erations (Fig. 2b, c); however, we wondered whether additional loss 
could be obscured owing to lack of resolution of OTUs. Therefore, we 
identified 280 high-confidence sub-OTUs in the control group and 
261 high-confidence sub-OTUs in the diet-switching group using a 
cluster-free filtering approach (Supplementary Table 8). A similar
decline in the number of sub-OTUs with each generation was observed 
(114 out of 261 sub-OTUs were undetectable by generation four, week 
15) (Extended Data Fig. 6 and Supplementary Table 8). Most of the lost 
taxa (77 out of the 114) were from the Bacteroidales order with an addi- 
tional loss of 32 taxa from the Clostridiales. Between generation three 
and four, we detected loss of 22 sub-OTUs, compared to four using 
the lower-resolution high-confidence OTUs (Supplementary Table 8). 
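
The transmission comparison earlier in this paragraph uses the hypergeometric distribution. A minimal sketch of such a test is given below: it asks how likely it would be, if transfer were independent of abundance, to see so few low-abundance taxa among those transferred. All counts are hypothetical, and the exact formulation used in the study is not given in the text, so this is only one plausible reading.

```python
from math import comb

def hypergeom_pmf(k, population, successes, draws):
    """P(X = k): k successes in `draws` draws without replacement."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))

def hypergeom_cdf(k, population, successes, draws):
    """P(X <= k) for the same hypergeometric distribution."""
    lowest = max(0, draws - (population - successes))
    highest = min(k, draws, successes)
    return sum(hypergeom_pmf(i, population, successes, draws)
               for i in range(lowest, highest + 1))

# Hypothetical counts for one generation-to-generation comparison.
total_taxa = 100        # taxa present in the parental generation
transferred = 70        # taxa detected in the offspring
low_abundance = 30      # parental taxa below the abundance threshold
low_transferred = 12    # low-abundance taxa detected in the offspring

p_value = hypergeom_cdf(low_transferred, total_taxa, transferred, low_abundance)
print(p_value)
```

A small probability here would indicate that low-abundance taxa are under-represented among the transferred taxa relative to chance.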

Because high dietary MACs were insufficient to restore microbiota 
composition or diversity to control levels, we tested whether reintro- 
duction of lost bacteria was required. Fourth-generation, diet-switching 
mice were gavaged with faecal samples (faecal microbiota transplant 
(FMT) group) from fourth-generation high-MAC-diet controls. Because 
the low-MAC diet does not support full microbiota diversity (Fig. 2d), 
the fourth-generation FMT recipients were fed a high-MAC diet for 
2 weeks (Fig. 3a). Within 10 days, microbiota composition and diversity 
of the FMT group was indistinguishable from fourth-generation high- 
MAC-diet controls (P = 0.4, Student’s t-test; UniFrac distance; P= 0.4, 
Student's t-test, Shannon index) (Fig. 3b, c), and 110 taxa were restored 
(average abundance >1 sequencing read; no taxa restored in no FMT 
controls) (Fig. 3d and Supplementary Table 9). Restored taxa were predominantly
from the Bacteroidales (99 taxa), which experienced the
greatest loss in high-confidence OTUs (Extended Data Fig. 7a). Similar
results were observed using the high-confidence sub-OTUs (Extended 
Data Fig. 7b and Supplementary Table 10).

Figure 2 | Inefficient inter-generational transfer of taxa driven to low
abundance by low dietary MACs. a, Schematic of multigeneration mouse
experiment. Second- (n = 6), third- (n = 6) and fourth- (n = 6) generation
mice were weaned onto a low-MAC diet. After mice generated a litter of
pups that were weaned, low-MAC-diet mice were switched to the high-MAC
diet for 6 weeks. A parallel group of control mice were maintained
on the high-MAC diet throughout (generation 2, n = 6; generation 3, n = 6;
generation 4, n = 5). b, Microbiota diversity as measured by Shannon
index observed in the microbiota of mice at 5 weeks old (top panel, n = 6
for each group) or 4 weeks after shift to high-MAC diet (bottom panel,
n = 6 for each group) from three generations of diet-switching mice (grey)
or control high-MAC-diet mice (black). Error bars are s.e.m., and P values
are from two-tailed Student's t-test. c, Principal coordinate analysis of …
collected from first-generation mice from the control group consuming a
high-MAC diet (green, n = 5) or the diet-switching group from generation
one (G1; yellow, n = 5), two (G2; blue, n = 6), three (G3; red, n = 6) and
four (G4; purple, n = 6). d, Heat map of abundance of high-confidence
OTUs (number of sequencing reads, columns) from the diet-switching
group (top) and controls (bottom); taxonomic assignment is indicated at
the top of each column (Bacteroidetes, green; Firmicutes, orange; other,
grey). Each row represents an individual mouse microbiota from 4 weeks
post-humanization (initial), while consuming the low-MAC diet (week 9,
lo, shaded yellow), and 4 weeks after switching to the high-MAC diet
(week 15, hi, shaded grey). Corresponding time points from controls are
similarly shaded. n = 5, 6, 6 and 6 for the diet-switching group and n = 5,
6, 6 and 5 for the control group for generations one to four, respectively.


Figure 3 | Reintroduction of lost taxa and a 
high-MAC diet restores microbiota diversity and 
composition. a, Schematic of faecal transplant 
mouse experiment. b, Principal coordinate analysis 
of UniFrac distance for 16S rRNA amplicon 
profiles from faecal samples collected from 
fourth-generation control mice on a high-MAC diet 
(green, n = 6), fourth-generation diet-switching 
mice that received a faecal transplant (red, n = 3), 
or did not (blue, n = 3). c, Microbiota diversity 
as measured by Shannon index observed in the 
microbiota of mice that received a faecal transplant 
(red, n = 3) or did not (blue, n = 3). A green circle 
denotes the number of OTUs observed in 
fourth-generation control mice consuming a high-MAC 
diet (n = 6). Error bars are s.e.m. d, Heat map of 
abundance of high-confidence OTUs (number of 
sequencing reads) from fourth-generation 
diet-switching mice (n = 3) 3-14 days after FMT, and 
no-FMT controls (n = 3); taxonomic assignment is 
indicated at the top of each column (Bacteroidetes, 
green; Firmicutes, orange; other, grey). FMT donor 
(fourth-generation control mice, n = 5) and 
fourth-generation diet-switching mice (n = 5) 4 weeks 
after consuming high-MAC diet are also shown. 


These data demonstrate a diet-induced ratcheting effect in which 
certain taxa decrease in abundance upon reduced MACs and are not 
effectively transferred to the next generation. Notably, most of the lost 
taxa are Bacteroidales, an order that is proficient in consumption of 
dietary fibre'®. Introduction of dietary MACs is insufficient to regain 
‘lost’ taxa in the absence of their deliberate re-introduction. 

Over our history, humans have experienced major dietary changes 
from gathered to farmed foods during the agricultural revolution, 
and more recently to the mass consumption of processed foods in the 
industrialized world. Each dietary shift was probably accompanied 
by a concomitant adjustment in the microbiota. Here we have used a 
model in which mice have been colonized with a human microbiota 
from a Westerner to determine the effect of fibre deprivation over four 
generations on the gut microbiota. This model does not allow us to 
address microbiota changes that may have occurred as humans shifted 
from a hunter-gatherer lifestyle to one from a modern industrialized 
country. Our data support a model in which consuming a modern 
diet low in fibre contributes to the loss of taxa over generations, and 
may be responsible for the lower-diversity microbiota observed in the 
industrialized world compared to present-day hunter-gatherers and 
rural agrarians. The data we present also hint that further deterioration 
of the Western microbiota is possible. 

The gut microbiota regulates numerous facets of human biology, 
suggesting that our human genome has been shaped by interactions with 
these microorganisms over our co-evolutionary history. However, the 
microbiota can change on a timescale that is much faster than the host, 
allowing for the possibility that the microbiota, if pressed by severe 
selective forces, could undergo change so rapidly that it cannot be 
accommodated by our human biology. While the roles of different types 
of microbiota diversity in host health remain to be defined, it is possi- 
ble that rewilding the modern microbiota with extinct species may be 
necessary to restore evolutionarily important functionality to our gut. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


No statistical methods were used to predetermine sample size. 

Meta-analysis of human populations. 16S rRNA data sets from Hadza (n = 27), 
Malawian and Venezuelan (n = 213) and American (n = 315)** were trimmed 
to match FLX chemistry as previously described (MG-RAST Projects 111, 528, 
7058)!*. OTUs were picked on the Greengenes 13.8 database with a 97% similarity 
threshold using UCLUST”. Alpha- and beta-diversity measures were calculated 
using unweighted UniFrac'* on rarefied OTU tables. Principal coordinates were 
computed using QIIME 1.8 (ref. 19). Faecal 16S rRNA data were not included 
from other studies of traditional populations because (1) they did not use the V4 
region of the 16S rRNA for amplification and were thus not comparable (Papua 
New Guinea’); (2) they have not made the data publicly available (Yanomami’); 
or (3) the control Western data were not comparable to other studies (Matses®). 
Mice. All mouse experiments were performed in accordance with A-PLAC, the 
Stanford IACUC, on mixed gender germ-free Swiss Webster mice that were humanized 
by oral gavage of a human faecal sample obtained from a healthy anonymous 
donor (American male living in the San Francisco Bay Area, California, age 36, 
omnivorous diet) as previously described”’. Humanized mice closely reconsti- 
tuted the diversity and phylogenetic make-up of the donor (Pearson's r= 0.96; 
Extended Data Fig. 2). The investigators were not blinded to allocation during 
experiments and outcome assessment. Mice were randomly assigned to two groups 
and were fed either a high-MAC diet (LabDiet 5010) or a low-MAC diet (Harlan 
TD.86489). The high-MAC diet is a plant polysaccharide-rich diet in which the 
MACs come from a diverse source of plants including corn, soybean, wheat, oats, 
alfalfa and beet. The reported neutral detergent fibre content of the high-MAC diet 
is 15% by weight. The low-MAC diet is defined as a diet in which carbohydrates are 
from sucrose (31% by weight), corn starch (31% by weight) and cellulose (5% by 
weight). The accessibility of cellulose to gut microbiota is known to be extremely 
low and we have been unable to isolate bacteria from the microbiota that use 
this substrate’. Mice from the faecal transplant experiment were from the fourth 
generation of mice from the diet-switching group. Faecal transplants were carried 
out by gavage using freshly collected faecal samples from fourth-generation con- 
trol mice consuming a high-MAC diet using a procedure identical to the original 
humanization”!. Faecal samples were collected throughout all mouse experiments 
and stored at −80 °C. 

16S rRNA amplicon sequencing and analysis. 16S rRNA amplicons were generated 
for the V4 region from collected faecal pellets, and the 16,878,145 
Illumina-generated sequencing reads were analysed using QIIME 1.8 as described previously”. 
Sequencing data underwent quality filtering as described previously and data were 
rarefied to the lowest number of reads observed in a single sample (28,596 reads)”. 
OTUs were identified by open-reference picking using the UCLUST algorithm 
and taxonomy was assigned using the Greengenes 13.8 database. Plots of UniFrac 
distances are from unweighted analyses and UniFrac distance values are reported 
as within group versus between groups. Microbiota diversity was measured by 
Shannon index, which takes into account both overall richness and evenness. 
High-confidence OTUs were identified using the following criteria: present in at least 
three mice at the start of the experiment (4 weeks after humanization) and a 
collective abundance of greater than 25 reads. High-confidence sub-OTUs were 
selected by first filtering out duplicate sequences whose distribution across samples 
is highly correlated (thresholded at 0.95 dynamical similarity). These duplicates 
correspond to the same bacterial populations, either as uncommonly frequent 
errors or as multiple 16S copies within the same bacterium. Second, for each exper- 
imental group all sequences whose initial raw abundance was at least 5 reads in at 
least 3 out of 5 mice in that isolator were selected as high-confidence sub-OTUs. 
Sub-OTU level analysis was performed as described previously’. 
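
The rarefaction and high-confidence OTU criteria described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study (which relied on QIIME 1.8): the function names, the sample layout (one dict of OTU counts per mouse) and the example counts are all hypothetical. The thresholds (at least three mice, more than 25 reads collectively) are the ones stated in the text.

```python
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from one sample's OTU counts."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    return dict(Counter(rng.sample(pool, depth)))


def high_confidence_otus(initial_samples, min_mice=3, min_total_reads=25):
    """Keep OTUs present in at least `min_mice` mice whose summed reads exceed `min_total_reads`."""
    otus = set().union(*(s.keys() for s in initial_samples))
    kept = set()
    for otu in otus:
        per_mouse = [s.get(otu, 0) for s in initial_samples]
        n_present = sum(1 for c in per_mouse if c > 0)
        if n_present >= min_mice and sum(per_mouse) > min_total_reads:
            kept.add(otu)
    return kept


# Hypothetical OTU counts for four mice at the start of the experiment.
samples = [
    {"otu_a": 20, "otu_b": 1, "otu_c": 30},
    {"otu_a": 15, "otu_b": 2},
    {"otu_a": 10, "otu_b": 1, "otu_c": 40},
    {"otu_b": 1},
]

# Rarefy every sample to the depth of the shallowest one, as described in the text.
rng = random.Random(0)
depth = min(sum(s.values()) for s in samples)
rarefied = [rarefy(s, depth, rng) for s in samples]

print(sorted(high_confidence_otus(samples)))
```

In this toy input only `otu_a` passes both criteria: `otu_b` is widespread but has too few reads, and `otu_c` is abundant but found in only two mice.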

Glycoside hydrolase profiles. This method was validated using samples for which 
16S rRNA profiles and metagenomic data were available (imputed glycoside hydro- 
lase profiles explained 84% of the annotated metagenomic data using a simple lin- 
ear fit without any model corrections; P= 10-*5)°. Glycoside hydrolase imputations 
were performed by annotating 2,746 reference genomes with glycoside hydrolase 
families using validated hidden Markov models many of which were derived from 
conserved domain database models that are capable of identifying several domains 
within a putative enzyme with increased sensitivity***. The glycoside hydrolases 
from taxonomically assigned communities were then calculated by applying a 
weighted average of the lowest taxonomic level with representative genomes. To 
account for the fact that many glycoside hydrolase families have wide ranging 
functions while each glycoside hydrolase subfamily may possess distinct function, 
we further clustered the annotated glycoside hydrolases into subfamilies as previ- 
ously described?*”56, Fold-changes of glycoside hydrolase family copy numbers 
were determined by comparing generation-four mice to the mean generation-one 
glycoside hydrolase profiles with significance testing performed across treatment 
and control groups. 
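
One possible reading of the imputation step above is sketched below: each OTU contributes the mean glycoside hydrolase (GH) copy numbers of the reference genomes found at its lowest taxonomic level that has any, weighted by the OTU's relative abundance. The data layout, function names, taxa and copy numbers are hypothetical and chosen only to make the arithmetic visible; the hidden Markov model annotation and subfamily clustering are not reproduced.

```python
from statistics import mean


def impute_gh_profile(abundances, lineages, reference_gh):
    """Abundance-weighted average of GH family copy numbers for one community.

    abundances: {otu: relative abundance}
    lineages: {otu: [most specific taxon, ..., least specific taxon]}
    reference_gh: {taxon: [{gh_family: copy number}, ...]}, one dict per genome
    """
    profile = {}
    covered = 0.0
    for otu, weight in abundances.items():
        # Use the lowest (most specific) taxonomic level that has reference genomes.
        genomes = next(
            (reference_gh[t] for t in lineages[otu] if reference_gh.get(t)), None
        )
        if genomes is None:
            continue  # no representative genome at any level
        covered += weight
        families = set().union(*genomes)
        for fam in families:
            avg_copies = mean(g.get(fam, 0) for g in genomes)
            profile[fam] = profile.get(fam, 0.0) + weight * avg_copies
    if covered == 0:
        return {}
    # Normalize by the abundance that could be mapped to a reference genome.
    return {fam: value / covered for fam, value in profile.items()}


def fold_change(later, baseline):
    """Per-family ratio later/baseline, for families present in the baseline."""
    return {
        fam: later.get(fam, 0.0) / base for fam, base in baseline.items() if base > 0
    }


# Hypothetical reference annotations and communities (illustrative values only).
reference_gh = {
    "g__Bacteroides": [{"GH13": 6, "GH43": 30}, {"GH13": 4, "GH43": 20}],
    "o__Clostridiales": [{"GH13": 2}],
}
lineages = {
    "otu1": ["s__ovatus", "g__Bacteroides", "o__Bacteroidales"],
    "otu2": ["g__Blautia", "f__Lachnospiraceae", "o__Clostridiales"],
}
gen1 = impute_gh_profile({"otu1": 0.6, "otu2": 0.4}, lineages, reference_gh)
gen4 = impute_gh_profile({"otu1": 0.2, "otu2": 0.8}, lineages, reference_gh)
print(fold_change(gen4, gen1))
```

Significance testing across treatment and control groups, as described in the text, would be applied to per-mouse profiles and is left out of this sketch.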
Extended Data Figure 1 | Collating data from studies of the microbiota 
of hunter-gatherers in Tanzania, agrarians from Malawi and Venezuela, 
and Westerners from the United States reveals that Western populations 
have depleted alpha-diversity from birth through childbearing years 
and are missing bacterial taxa present in the traditional groups. 
a, Scatterplot of faecal microbiota of individuals plotted by phylogenetic 
diversity against age of the Hadza hunter-gatherers from Tanzania (n = 16, 
green), agrarians from Malawi (n = 81, red) and Venezuela (n = 78, purple) 
and Americans (n = 213, blue). b, Individuals plotted by unweighted 
UniFrac PC1 versus phylogenetic diversity. c, Individuals plotted by 
unweighted UniFrac PC1 versus age. d, Line plot of unique OTUs from 
faecal microbiota across populations (Americans, n = 315; Malawi and 
Venezuela, n = 213; Tanzania, n = 27). OTUs (x axis; black, present; 
white, absent) are considered present if represented by >0.001% of reads 
within each population. OTUs were sorted along the x axis by their relative 
abundance in the US and Tanzanian populations and further subdivided 
by their distributions within a population into tracks (red >0.05%, 
yellow <0.05%, and green <0.01%, relative abundance). The opacity of 
the line is the proportion of that population that meets the criteria for that 
respective track. 
Extended Data Figure 2 | Comparison of human donor and humanized 
mice. a, Taxa summary plot of the relative abundance of taxa from 
humanized mice faeces (mice) (n = 10) and human donor faeces (human) 
(n = 1). b, Alpha-diversity of the faecal microbiota from humanized mice 
(mice) and human donor (human) expressed as number of OTUs (top) 
and phylogenetic diversity (bottom). Error bars are s.e.m. 
Extended Data Figure 3 | Detailed schematic of multigeneration 
experiment. Generation one: humanized mice were fed a high-MAC diet 
for 4 weeks then switched to a low-MAC diet. One week after diet switch, 
the mice were bred to generate a litter of pups. After three additional 
weeks on the low-MAC diet, generation-two pups were born and remained 
in the cage with their mother, suckling for 3 weeks (generation one still 
consuming the low-MAC diet). After pups were weaned, generation-one 
mice were returned to the high-MAC diet for 6 weeks. Generation two: 
pups were weaned from their mother at 3 weeks old onto a low-MAC diet, 
which they consumed for 10 weeks. Breeding pairs for generation-two 
mice were set up at 7 weeks old. After three additional weeks on the 
low-MAC diet, generation-three pups were born and remained in the cage with 
MAC diet, generation-three pups were born and remained in the cage with 
their mother, suckling for 3 weeks (generation-two mice still consuming 
the low-MAC diet). After pups were weaned, generation-two mice were 
returned to the high-MAC diet for 6 weeks. Generations three and four 
followed the same protocol as generation two described above. 
Extended Data Figure 4 | Microbiota diversity is not regained after 
directly weaning the diet-switching group onto the high-MAC diet. 
a, Alpha-diversity as measured by Shannon index of faecal microbiota 
from generation-five mice from the high-MAC-diet control (control) 
(n = 6), generation-five diet-switching group that was weaned directly 
onto the high-MAC diet (Gen 5 diet switching) (n = 6), and 
generation-four mice from the diet-switching group after weaning and maintenance 
on the low-MAC diet for 13 weeks and returned to the high-MAC diet for 
4 weeks (Gen 4 diet switching) (n = 5). Error bars are s.e.m. and P values 
are from a two-tailed Student's t-test. b, Principal coordinate analysis of 
unweighted UniFrac distance for 16S rRNA amplicon profiles from faecal 
samples collected from first-generation control mice on a high-MAC 
diet (green), fourth-generation diet-switching mice (purple), and 
fifth-generation mice from the diet-switching lineage weaned directly onto the 
high-MAC diet (orange). Control is plotted as weeks post-humanization 
and generations four and five are plotted as age. 
Extended Data Figure 5 | Fraction of high-confidence OTUs from 
the Clostridiales order increases and from the Bacteroidales order 
decreases over several generations in the low-MAC-consuming mice. 
a, Percentage of high-confidence OTUs, grouped by order, detected in 
mice faeces over four generations in the diet-switching lineage on the 
low-MAC diet (lo) and high-MAC diet (hi) (n = 5 for Gen 1; n = 6 for Gen 
2-4). b, Percentage of high-confidence OTUs, grouped by order, detected 
in mice faeces over four generations in the control high-MAC diet lineage 
at the equivalent time points to the high-MAC diet (a) and low-MAC diet 
(b) of the diet-switching group (n = 5 for Gen 1; n = 6 for Gen 2-4). 
c, Imputed glycoside hydrolase (GH) family members that show 
significant differences (at least twofold change and P < 0.05, 
Bonferroni-corrected t-test) between generation-four diet-switching mice after 
4 weeks on the high-MAC diet (teal) (n = 5) and the starting 
generation-one mice (salmon) (n = 10). Error bars depict s.e.m. No glycoside 
hydrolase families showed significant changes in the control group. 
Extended Data Figure 6 | Inefficient inter-generational transfer of taxa 
driven to low abundance by low dietary MACs. Heat map of abundance 
of high-confidence sub-OTUs (number of sequencing reads, columns) 
from faeces of the diet-switching (top) and control (bottom) group. Each 
row represents an individual mouse faecal microbiota from 4 weeks 
post-humanization (initial), while consuming the low-MAC diet (week 9, lo, 
shaded yellow), and 4 weeks after switching to the high-MAC diet 
(week 15, hi, shaded grey). Corresponding time points from controls are 
also shaded. Top row shows the taxonomic assignment for the OTUs plotted: 
Bacteroidetes are green, Firmicutes are orange, and others are grey. 
Extended Data Figure 7 | Reintroduction of lost taxa and a high-MAC 
diet restores microbiota diversity and composition with Clostridiales 
order decreasing and Bacteroidales order increasing in 
low-MAC-consuming mice that receive a faecal transplant. a, Plot of percentage 
representation of high-confidence OTUs from generation-four mice faeces 
in the diet-switching group at day 0 before the FMT (starting) (n = 6) and 
then 3-14 days no-FMT control (n = 3) or post-FMT (n = 3). FMT donor 
is plotted on the right. b, Heat map of abundance of high-confidence 
sub-OTUs (number of sequencing reads, columns) from the faeces of 
the diet-switching group at day 0 (Gen 4), days 3-14 that did not receive 
an FMT (control) (n = 3 for each day), days 3-14 that received an FMT 
(+FMT), and the FMT donor. Each row represents an individual mouse 
faecal microbiota. Top row shows the taxonomic assignment for the OTUs 
plotted: Bacteroidetes are green, Firmicutes are orange, and others are 
grey. 


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


Extended Data Table 1 | Nutritional information of mouse diets 


Diet Name   Supplier/Product Name   Protein         Carbohydrates   Fat 
                                    (% by weight)   (% by weight)   (% by weight) 
High-MAC    LabDiet 5010            25              50              5 
Low-MAC     Harlan TD.86489*        18              62              5 

Fibre (% by weight) 
Neutral Detergent Fibre 

* Remaining 5% composed of vitamins and minerals 
** Exclusively from added cellulose 
Extended Data Table 2 | High-confidence OTUs at experiment start 


Taxonomy (# OTUs) (# OTUs) 
p__Bacteroidetes: c__Bacteroidia: o__Bacteroidales: f__[Odoribacteraceae]: g__Butyricimonas: s__ 0 1 
p__Bacteroidetes: c__Bacteroidia: o__Bacteroidales: f__Bacteroidaceae: g__Bacteroides: s__ 49 48 
p__Bacteroidetes: c__Bacteroidia: o__Bacteroidales: f__Bacteroidaceae: g__Bacteroides: s__fragilis 2 2 
p__Bacteroidetes: c__Bacteroidia: o__Bacteroidales: f__Bacteroidaceae: g__Bacteroides: s__ovatus 2 2 
p__Bacteroidetes: c__Bacteroidia: o__Bacteroidales: f__Bacteroidaceae: g__Bacteroides: s__uniformis 5 6 
p__Bacteroidetes: c__Bacteroidia: o__Bacteroidales: f__Porphyromonadaceae: g__Parabacteroides: s__ 3 5 
p__Bacteroidetes: c__Bacteroidia: o__Bacteroidales: f__Porphyromonadaceae: g__Parabacteroides: s__distasonis 2 1 
p__Bacteroidetes: c__Bacteroidia: o__Bacteroidales: f__Porphyromonadaceae: g__Parabacteroides: s__gordonii 1 2 
p__Bacteroidetes: c__Bacteroidia: o__Bacteroidales: f__Rikenellaceae: g__:s__ it 1 
p__Bacteroidetes: c__Bacteroidia: o__Bacteroidales: f__S24-7:g__:s__ 43 58 
p__Cyanobacteria: c__4C0d-2: o__YS2: f__: g__: s__ 2 0 
p__Firmicutes: c__Clostridia: o__Clostridiales: f__:g__:s__ 3 3 
p__Firmicutes: c__Clostridia: o__Clostridiales: f__Lachnospiraceae: g__: s__ 22 17 
p__Firmicutes: c__Clostridia: o__Clostridiales: f__Lachnospiraceae: g__[Ruminococcus]: s__ 3 3 
p__Firmicutes: c__Clostridia: o__Clostridiales: f__Lachnospiraceae: g__[Ruminococcus]: s__gnavus 6 0 
p__Firmicutes: c__Clostridia: o__Clostridiales: f__Lachnospiraceae: g__Blautia: s__ 20 17 
p__Firmicutes: c__Clostridia: o__Clostridiales: f__Lachnospiraceae: g__Blautia: s__producta 5 3 
p__Firmicutes: c__Clostridia: o__Clostridiales: f__Lachnospiraceae: g__Coprococcus: s__ 6 4 
p__Firmicutes: c__Clostridia: o__Clostridiales: f__Lachnospiraceae: g__Dorea: s__ 4 3 
p__Firmicutes: c__Clostridia: o__Clostridiales: f__Ruminococcaceae: g__: s__ 6 9 
p__Firmicutes: c__Clostridia: o__Clostridiales: f__Ruminococcaceae: g__Faecalibacterium: s__prausnitzii 0 4 
p__Firmicutes: c__Clostridia: o__Clostridiales: f__Ruminococcaceae: g__Oscillospira: s__ 4 3 
p__Firmicutes: c__Clostridia: o__Clostridiales: f__Ruminococcaceae: g__Ruminococcus: s__ 5 3 
p__Firmicutes: c__Erysipelotrichi: o__Erysipelotrichales: f__Erysipelotrichaceae: g__: s__ 4 ‘A 
p__Firmicutes: c__Erysipelotrichi: o__Erysipelotrichales: f__Erysipelotrichaceae: g__[Eubacterium]: s__dolichum 2 2 
p__Firmicutes: c__Erysipelotrichi: o__Erysipelotrichales: f__Erysipelotrichaceae: g__Coprobacillus: s__ 1 1 
p__Firmicutes: c__Erysipelotrichi: o__Erysipelotrichales: f__Erysipelotrichaceae: g__Holdemania: s__ 1 0 
p__Proteobacteria: c__Alphaproteobacteria: o__RF32:f__:g__:s__ 2 2 
p__Proteobacteria: c__Betaproteobacteria: o__Burkholderiales: f__Alcaligenaceae: g__Sutterella: s__ 3 3 
p__Proteobacteria: c__Deltaproteobacteria: o__Desulfovibrionales: f__Desulfovibrionaceae: g__Bilophila: s__ 1 1 
p__Proteobacteria: c__Gammaproteobacteria: o__Enterobacteriales: f__Enterobacteriaceae: g__:s__ 1 0 
p__Verrucomicrobia: c__Verrucomicrobiae: o__Verrucomicrobiales: f__Verrucomicrobiaceae: g__Akkermansia: s__muciniphila 3 3 
Unassigned 1 0 
Extended Data Table 3 | Shannon index of glycoside hydrolase subfamilies 


Gen1 Gen1 Gen1 Gen1 Gen1 Gen4 Gen4 Gen4 Gen4 Gen4 significance 
Comparisons Mouse1 Mouse2 Mouse3 Mouse4 Mouse5 Mouse1 Mouse2 Mouse3 Mouse4 Mouse5 P value sign 
Diet-switching 9.39 9.42 9.34 9.35 9.40 9.23 9.21 9.12 9.26 9.17 2.2E-04 = 
Control 9.41 9.39 9.45 9.48 9.45 9.42 9.34 9.42 9.41 9.41 0.13 
FOXO1 couples metabolic activity and growth state 
in the vascular endothelium 


Kerstin Wilhelm, Katharina Happel, Guy Eelen, Sandra Schoors, Mark F. Oellerich, Radiance Lim, Barbara Zimmermann, 
Jens C. Brüning, Holger Gerhardt, Peter Carmeliet & Michael Potente 


Endothelial cells (ECs) are plastic cells that can switch between 
growth states with different bioenergetic and biosynthetic 
requirements!. Although quiescent in most healthy tissues, ECs 
divide and migrate rapidly upon proangiogenic stimulation”. 
Adjusting endothelial metabolism to the growth state is central to 
normal vessel growth and function, yet it is poorly understood 
at the molecular level. Here we report that the forkhead box O 
(FOXO) transcription factor FOXO1 is an essential regulator of 
vascular growth that couples metabolic and proliferative activities 
in ECs. Endothelial-restricted deletion of FOXO1 in mice induces 
a profound increase in EC proliferation that interferes with 
coordinated sprouting, thereby causing hyperplasia and vessel 
enlargement. Conversely, forced expression of FOXO1 restricts 
vascular expansion and leads to vessel thinning and hypobranching. 
We find that FOXO1 acts as a gatekeeper of endothelial quiescence, 
which decelerates metabolic activity by reducing glycolysis and 
mitochondrial respiration. Mechanistically, FOXO1 suppresses 
signalling by MYC (also known as c-MYC), a powerful driver 
of anabolic metabolism and growth®®. MYC ablation impairs 
glycolysis, mitochondrial function and proliferation of ECs while 
its EC-specific overexpression fuels these processes. Moreover, 
restoration of MYC signalling in FOXO1-overexpressing 
endothelium normalizes metabolic activity and branching 
behaviour. Our findings identify FOXO1 as a critical rheostat of 
vascular expansion and define the FOXO1-MYC transcriptional 
network as a novel metabolic checkpoint during endothelial growth 
and proliferation. 

FOXO transcription factors are effectors of the 
phosphatidylinositol-3-OH kinase (PI(3)K)/AKT pathway that links growth and 
metabolism”*. PI(3)K signalling inhibits FOXOs through AKT-mediated 
phosphorylation leading to their nuclear exclusion®!”. We investigated 
the role of FOXO1 in ECs, an enriched FOXO family member in the 
endothelium!!-»». To this end, we bred floxed Foxo1 mice (Foxo1^fl/fl)'6 
with a Tie2-cre deleter (Extended Data Fig. 1a), which recombines 
in endothelial and haematopoietic cells. Tie2-cre-mediated deletion 
of Foxo1 (Foxo1^EC-KO) caused defective vascular development and 
embryonic lethality around embryonic day (E)10.5 (Extended Data 
Fig. 1b, c)!”, suggesting that endothelial FOXO1 is essential for embryo 
development. 

Immunofluorescence analysis of developing blood vessels in the post- 
natal retina showed high levels of FOXO1 expression in the endothe- 
lium (Fig. la). Further examination of the subcellular distribution 


revealed a diffuse nucleocytoplasmic localization of FOXO1 at the 
angiogenic front, where most of the EC proliferation occurs, but a 
stronger nuclear pattern in the plexus, where vessels remodel, and 
endothelial proliferation abates (Fig. 1a). This spatial difference in sub- 
cellular localization suggests that FOXO1 is important for governing 
endothelial growth. To test this, we assessed the impact of Foxo1 
deletion on retinal angiogenesis using the tamoxifen-inducible, 
endothelial-selective Pdgfb-creERT2 line (Foxo1^iEC-KO). Recombination was 
monitored with the Rosa26-mT/mG (mTmG) reporter that expresses 
green fluorescent protein (GFP) upon Cre-mediated recombina- 
tion. 4-hydroxy-tamoxifen (4-OHT) treatment resulted in broad 
GFP expression in ECs as well as extinction of endothelial FOXO1 
staining (Extended Data Fig. 1d-f). Endothelial loss of Foxo1 caused 
a dense and hyperplastic vasculature and resulted in the inability of 
ECs to extend proper sprouts (Fig. 1b-f). Instead, ECs grew in clus- 
ters leading to vessel enlargement and blunting of the angiogenic front 
(Fig. 1d, f). Strikingly, numerous filopodial bursts were emanating from 
the stunted front (Fig. 1c, d), suggesting that FOXO1 deficiency results 
in uncoordinated vascular growth. Staining Foxo1^iEC-KO retinas for ERG 
(marking endothelial nuclei) and VE-cadherin (marking endothe- 
lial junctions), revealed an abundance of abnormally aligned ECs, 
which formed vessels with wide and irregular lumens (Extended Data 
Fig. 2a-c). Assessment of 5-bromodeoxyuridine (BrdU) incorporation 
and phospho-histone H3 (pHH3) labelling demonstrated a substantial 
increase in endothelial proliferation in the Foxo1^iEC-KO mutants 
(Fig. 1g, j and Extended Data Fig. 2d), indicating that deregulated 
proliferation drives this aberrant vessel phenotype. Importantly, the 
vascular defects of the Foxo1^iEC-KO mice did not normalize at later stages of 
development, but showed a persistent increase in endothelial number, 
density and vessel diameter (Fig. 1h, i and Extended Data Fig. 2e-g). 
We conclude that FOXO1 is a suppressor of endothelial growth and 
proliferation, whose inactivation leads to uncontrolled overgrowth. 
Next, we determined the consequences of FOXO1 activation in ECs. 
We used a Cre-inducible gain-of-function allele (Foxo1^CA) in which 
the AKT phosphorylation sites are mutated, thus rendering FOXO1 
constitutively nuclear (Extended Data Fig. 3a)'*. Tie2-cre-mediated 
expression of this IRES-GFP-coexpressing mutant (Foxo1^EC-CA) was 
incompatible with embryo survival beyond E10.5 (Fig. 2a), highlighting 
the sensitivity of ECs towards changes in FOXO1 status. We then used 
the Pdgfb-creERT2 strain to express Foxo1^CA in the retinal 
endothelium (Foxo1^iEC-CA). Immunofluorescence studies revealed an enriched 
FOXO1 signal in endothelial nuclei and confirmed the EC-specific 
Figure 1 | Endothelial FOXO1 is an essential regulator of vascular 
growth. a, Staining for FOXO1, VE-cadherin (VECAD) and isolectin-B4 
(IB4) in a postnatal day (P)5 mouse retina. The bottom panels depict the 
FOXO1 signal of the boxed area. Arrowheads point to ECs with weak 
FOXO1 nuclear staining. b, c, Overview (b) and higher magnification 
(c) images of IB4-stained retinal vessels at P5 in Foxo1^iEC-KO and controls. 
A, artery; V, vein. d, Bar graphs showing endothelial area (n > 7), branch 
diameter (n > 7), and number of filopodia (n > 5). e, Images of IB4- and 
ERG-stained P5 retinas of control and Foxo1^iEC-KO mutants. f, PECAM- 
and ERG-stained retinas showing endothelial clustering at the angiogenic 
front in Foxo1^iEC-KO mutants. g, Increased endothelial BrdU incorporation 
in Foxo1^iEC-KO retinas. h, i, Confocal images of ICAM2-, IB4- and 
collagen IV (COL)-stained retinas at P21. j, Quantifications of ERG/IB4- 
(n > 9), BrdU/IB4- (n > 5) and pHH3/IB4- (n > 7) positive cells. Data in 
d and j represent mean ± standard deviation (s.d.), two-tailed unpaired 
t-test. ***P < 0.001; ****P < 0.0001. 


expression of GFP (Fig. 2b and Extended Data Fig. 3b, c, g). Forced 
activation of FOXO1 led to a sparse and hyperpruned vascular network 
that contained fewer ECs (Fig. 2c, d, f-h and Extended Data Fig. 3g). 
These retinal vessels established a lumen but were thinner and extended 
fewer filopodia (Fig. 2g, h and Extended Data Fig. 3d). Staining for 
BrdU incorporation and pHH3 revealed a reduction in EC 
proliferation in Foxo1^iEC-CA mice while endothelial apoptosis was not altered 
(Fig. 2e, f, i and Extended Data Fig. 3e, g). Similar phenotypes were 
observed in the hindbrain vasculature (Fig. 2j, k and Extended Data 
Fig. 4a-c), indicating that FOXO1 is a critical driver of endothelial 
quiescence. To examine this further, we analysed mosaic retinas of 
mTmG-coexpressing control and Foxo1^iEC-CA mice, in which the majority 
of ECs are unrecombined. Compared to controls, Foxo1^iEC-CA mice 
showed an impaired propagation of GFP-positive ECs in the retinal 
Figure 2 | Forced activation of FOXO1 restricts endothelial growth 
and vascular expansion. a, Overview images of control and Foxo1^EC-CA 
mice at E10.5. b, Staining for FOXO1, GFP and PECAM in P5 Foxo1^iEC-CA 
and control mice. c-e, IB4 (c), ERG and IB4 (d), and pHH3 and IB4 
(e) labelling of P5 retinas in Foxo1^iEC-CA and control mice. f, Quantification 
of vascular parameters in the control and mutant retinas as indicated 
(n > 5). g, Preserved luminal ICAM2 staining in Foxo1^iEC-CA mice. 
h, The number of empty (COL+, IB4-) sleeves (white arrows) in the retinal 
plexus is increased in the Foxo1^iEC-CA mutants. i, No difference in cleaved 
caspase 3- (CASP3; green) positive ECs between control and Foxo1^iEC-CA 
mice. j, Reduced vascularization of E11.5 hindbrains in Foxo1^iEC-CA 
mice. k, Quantification of vascular parameters in control and Foxo1^iEC-CA 
hindbrains (n > 5). f, k, Data represent mean ± s.d., two-tailed unpaired 
t-test. **P < 0.01; ***P < 0.001; ****P < 0.0001; NS, not significant. 


plexus (Extended Data Fig. 4d-f), arguing that the proquiescent activity 
of FOXO1 is cell autonomous. 

We next assessed whether FOXO1 regulates endothelial metabolism. 
Since ECs rely on glycolysis for vessel branching’, we first studied the 
effects of FOXO1 on this metabolic pathway. Transduction of human 
umbilical vein endothelial cells (HUVECs) with a FOXO1CA-encoding
adenovirus (AdFOXO1) led to a robust reduction in glycolysis as
evidenced by a reduction in extracellular acidification rate (ECAR),
glucose uptake, glycolytic flux and lactate production (Fig. 3a-d and
Extended Data Fig. 5a). This metabolic phenotype correlates with the
reduced proliferation in FOXO1CA-expressing ECs and raises a question
as to whether FOXO1 promotes mitochondrial oxidative phosphorylation.
Surprisingly, FOXO1 did not stimulate but instead diminished
oxidative metabolism as indicated by a decline in oxygen consumption
in AdFOXO1CA-expressing HUVECs (Fig. 3e). Moreover, reactive oxygen
species (ROS) formation and ATP levels were decreased (Fig. 3f and


14 JANUARY 2016 | VOL 529 | NATURE | 217 


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


[Figure 3 image. Labels recoverable from the graphic: AdCTL and AdFOXO1CA (legends); Oligo, FCCP and AA/R injections plotted against Time (min); immunoblot rows LC3-I, LC3-II, Flag, TUB and MYC with Ratio (%); GSEA plots for the FOXO1 motif and the MYC motif (ES, NES), marked "Enriched in AdFOXO1CA" and "Enriched in AdCTL".]


Figure 3 | FOXO1 slows endothelial metabolic activity and suppresses
MYC signalling. a, ECAR in ECs treated with or without oligomycin
(Oligo) showing reduced basal and maximal glycolytic activity in
AdFOXO1CA- compared to control adenovirus (AdCTL)-transduced
HUVECs (n = 6). b-d, Reduced 2-deoxy-D-glucose (2-DG) uptake
(b; n = 13), relative lactate production (c; n = 10) and glycolytic flux
(d; n = 4) in FOXO1CA-expressing ECs. e, Oxygen consumption rates
(OCR) in control and FOXO1CA-overexpressing ECs (n = 5) under basal
conditions and in response to oligomycin, fluoro-carbonyl cyanide
phenylhydrazone (FCCP) and antimycin A (AA)/rotenone (R). Data
represent mean ± s.d. Two-way analysis of variance (ANOVA) with
Bonferroni’s multiple comparison test. f, Relative ROS levels in AdCTL- or
AdFOXO1CA-transduced ECs (n = 7). CM-DCF, 5-(and-6)-chloromethyl-
2',7'-dichlorodihydrofluorescein diacetate. g, LC3 western blot analysis


Extended Data Fig. 5b). Importantly, FOXO1 did not induce endothe- 
lial apoptosis, senescence, autophagy or energy distress under the 
same experimental conditions (Fig. 3g and Extended Data Fig. 5c-h). 
Together, our data indicate that FOXO1 adapts metabolic activity to 
the lower requirements of the quiescent endothelium. 

To gain insight into the underlying mechanisms for this adaptability,
we performed transcriptome analysis of FOXO1CA- and


showing that overexpression of the Flag-tagged FOXO1 does not
induce autophagy in ECs. CQ, chloroquine; TUB, tubulin. Densitometric
quantifications are shown below the lanes (n = 10). h, GSEA of the
FOXO1 (AAACAA) or MYC (CACGTG) DNA-binding element gene
sets in AdFOXO1CA- or AdCTL-transduced ECs. ES, enrichment score;
NES, normalized enrichment score. i, Heat map of downregulated MYC
signature genes in FOXO1CA-overexpressing ECs (n = 3). j, k, Analysis of
MYC expression by microarray (j) and immunoblot (k) in FOXO1CA-Flag-
overexpressing endothelium. j, n = 6; k, n = 10. l, Quantitative polymerase
chain reaction (qPCR) expression analysis of FOXO1CA-regulated
genes involved in MYC signalling (n > 3). a-d, f, g, j-l, Data represent
mean ± s.d., two-tailed unpaired t-test. **P < 0.01; ****P < 0.0001;

+P < 00.1; NS, not significant. 


GFP-transduced HUVECs. Gene set enrichment analysis (GSEA) 
revealed an enrichment of the FOXO1 DNA-binding elements in 
genes induced by FOXO1, while the MYC DNA-binding motif was 
highly enriched in the repressed genes (Fig. 3h and Extended Data Fig. 
6a, b). Moreover, MYC target gene signatures were downregulated in 
the FOXO1 transcriptome (Fig. 3i and Extended Data Fig. 6c). Since 
MYC is a powerful driver of glycolysis, mitochondrial metabolism and




[Figure 4 image. Labels recoverable from the graphic: siSCR and siMYC; Control, MyciEC-KO, MyciEC-OE and Foxo1iEC-CA/MyciEC-OE; ERG and IB4 staining; Oligo, FCCP and AA/R injections; AdCTL, AdFOXO1CA, AdFOXO1CA/MYC and AdMYC; EC area per field (%).]
Figure 4 | MYC is a critical component of FOXO1 signalling in ECs.
a, b, ECAR (a) and OCR (b) in MYC siRNA- (siMYC) or scrambled
siRNA- (siSCR) transfected ECs (ECAR: n = 5; OCR: n = 5). Data
represent mean ± s.d., two-tailed unpaired t-test. c, Staining for MYC,
VECAD and PECAM in retinas of MyciEC-KO and control mice.
d, e, Images of IB4- (d) and IB4- and ERG- (e) stained P5 retinas of
control and MyciEC-KO mice. f, g, Images of IB4- (f) and pHH3- and
IB4- (g) stained P5 retinal vessels in MyciEC-OE and control mice.
h, i, Representative images (h) and quantification (i) of IB4-stained P5
retinas in control, Foxo1iEC-CA, MyciEC-OE and Foxo1iEC-CA/MyciEC-OE double
mutants (n > 6). j, k, ECAR (j) and OCR (k) in AdCTL, AdFOXO1CA,
AdFOXO1CA/AdMYC and AdMYC-transduced HUVECs showing
restoration of metabolic activity in FOXO1CA/MYC co-expressing ECs
(ECAR: n = 8; OCR: n > 3). i-k, Data represent mean ± s.d., one-way
ANOVA with Bonferroni’s multiple comparison post-hoc test. *P < 0.05;
**P < 0.01; ***P < 0.001; ****P < 0.0001.


growth°, FOXO1 might antagonize endothelial MYC signalling. In line 
with this, overexpression of FOXO1CA suppressed MYC expression,
whereas FOXO1 depletion enhanced MYC levels, both in HUVECs
and in ECs derived from the mutant mice (Fig. 3j-l and Extended Data
Fig. 6d-h). Immunofluorescence studies in Foxo1iEC-CA retinas confirmed
these findings and showed a suppression of endothelial MYC
expression (Extended Data Fig. 3f). Accordingly, numerous genes that
are induced by MYC were downregulated in FOXO1CA-overexpressing
HUVECs, including genes involved in cell metabolism and cell cycle
progression (Fig. 3i, l). This regulation is in line with the repression
of MYC by FOXOs in cancer cells!**3 and points to MYC as a crucial 
effector of FOXO1 in the coordination of endothelial metabolism and 
growth. 

Remarkably, FOXO1 also induced the expression of negative regulators
of MYC signalling including MXI1, an antagonist of MYC
transcriptional activity’, and FBXW7, an E3 ubiquitin ligase that targets
MYC for proteasomal degradation* (Fig. 3l and Extended Data Fig. 7a).
Consistent with these findings, MYC protein stability was decreased
in FOXO1CA-expressing ECs and co-treatment with the proteasomal
inhibitor MG132 partially restored MYC protein levels (Extended Data 
Fig. 7b, c). Knockdown of MXI1, on the other hand, did not affect 
the FOXO1-induced repression of MYC but attenuated the ability of 
FOXO1 to downregulate MYC target genes (Extended Data Fig. 7d-f). 
These data are in accordance with the function of MXI1 as a negative 
regulator of MYC activity®!”°, and suggest that FOXO1 intersects with 
MYC signalling at different levels. 

To explore further the role of MYC in ECs, we profiled the tran- 
scriptome of MYC short interfering RNA (siRNA)-transfected 
(siMYC) HUVECs. This analysis showed a suppression of MYC 
signature genes involved in metabolic processes and cell cycle pro- 
gression and validated the regulation of predicted MYC targets that 
were repressed by FOXO1 (Extended Data Fig. 8a-g). Bioenergetic
analysis revealed that MYC deficiency attenuated glycolysis and 
mitochondrial respiration, whereas adenoviral overexpression of 
MYC (AdMYC) stimulated these metabolic activities (Fig. 4a, b and 
Extended Data Fig. 9e). Conditional deletion of Myc (Mycfl/fl)24 in
mice using the Pdgfb-creERT2 deleter impaired vascular expansion 
and led to a thinned and poorly branched vasculature with reduced 
EC proliferation (Fig. 4c-e and Extended Data Fig. 9a-d). These 
phenotypes resemble the vascular defects in Foxo1iEC-CA mutant mice
and imply that MYC is a central component of endothelial FOXO1 
signalling. To test this directly, we attempted to rescue the endothe- 
lial phenotypes imposed by FOXO1 activation by restoring MYC 
signalling with a Cre-inducible Myc overexpressor allele (MycOE)”°.
Pdgfb-creERT2-induced overexpression of MYC caused sustained 
vascular overgrowth and led to a profound increase in EC number, 
proliferation and vessel density (Fig. 4f, g and Extended Data Fig. 
9f-k). We then combined the MycOE, Foxo1CA and Pdgfb-creERT2
alleles to generate endothelial-specific double mutants. Remarkably,
re-expression of MYC in ECs of Foxo1iEC-CA mice normalized
the hypobranched and hypocellular vascular phenotype caused
by FOXO1 activation (Fig. 4h, i and Extended Data Fig. 10a, b).
Moreover, coexpression of MYC and FOXO1CA in HUVECs restored
glycolysis, mitochondrial respiration and ROS formation (Fig. 4j, k 
and Extended Data Fig. 10c), indicating that regulation of MYC 
signalling by FOXO1 is critical for the coordination of endothelial 
metabolism and growth. 

This study identifies FOXO1 as a critical checkpoint of endothe- 
lial growth that restricts vascular expansion. Our data suggest that 
FOXO1 promotes endothelial quiescence by antagonizing MYC, 
which leads to a coordinated reduction in the proliferative and met- 
abolic activity of ECs. The FOXO1-induced deceleration of metabolic 
activity might not only enforce quiescence but also support endothe- 
lial function. For instance, by lowering metabolism, ECs will con- 
sume less energetic fuel for their homeostatic needs, thereby ensuring 
efficient nutrient and oxygen delivery. Reducing metabolic activity 
might also contribute to endothelial redox balance. ECs are long-lived 
cells that need to protect themselves against oxidative damage exerted 
by high oxygen levels in the bloodstream. The FOXO1-induced 
reduction in oxidative metabolism might thus be a mechanism to 
minimize the production of mitochondria-derived ROS, thereby 
conferring protection against the high-oxygen environment. Such 
a role of FOXO1 in endothelial metabolism aligns with the broader 
function of FOXOs in mediating oxidative stress resistance?> °°, and 
might also explain why ECs are exquisitely sensitive to a change in 
FOXO1 status. It will be interesting to determine how endothelial 
FOXO1 is regulated in vivo and how deregulation contributes to 
disease. 

Online Content Methods, along with any additional Extended Data display items and 


Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
Animals and genetic experiments. All conditional Foxo1- and Myc-mutant mice
were on a C57BL/6 genetic background and generated as described'®'*4, For 
constitutive Cre-mediated recombination in ECs, Foxo1fl/fl or Rosa26-Foxo1CA
mice were bred with Tie2-cre transgenic mice*’. To avoid recombination in the 
female germline, only Tie2-cre-positive male mice were used for intercrossing. 
Embryos were collected from cre-negative females at the indicated time points and 
genotyping was performed from isolated yolk sacs. For inducible Cre-mediated 
recombination in ECs, floxed mice were bred with transgenic mice expressing the 
tamoxifen-inducible, Pdgfb promoter-driven creERT2 recombinase*. The degree 
of Cre-mediated recombination was assessed with the double-fluorescent Cre- 
reporter Rosa26-mT/mG* allele, which was crossed into the respective mutant 
mice. For the analysis of angiogenesis in the postnatal mouse retina, Cre-mediated 
recombination was induced in newborn mice by intraperitoneal (i.p.) injections
of 25 µl 4-hydroxy-tamoxifen (4-OHT; 2 mg ml⁻¹; Sigma-Aldrich) from postnatal
day (P)1 to P4. Eyes were harvested at P5 or P21 for further analysis. In mosaic
recombination experiments, 4-OHT (20 µl g⁻¹ body weight of 0.02 mg ml⁻¹) was
injected i.p. at P3 and eyes were collected at P5. To induce Cre-mediated recombination
in mouse embryos, 100 µl of 4-OHT (10 mg ml⁻¹) was injected i.p. into
pregnant females from embryonic day (E)8.5 to E10.5. Embryos were harvested 
at E11.5 for the analysis of angiogenesis in the embryonic hindbrain. The Rosa26- 
Foxo1CA, Rosa26-Myc and Rosa26-mT/mG alleles were kept heterozygous for the
respective transgene in all experimental studies. Apart from the mosaic studies, 
control animals were littermate animals without cre expression. Male and 
female mice were used for the analysis, which were maintained under specific 
pathogen-free conditions. Experiments involving animals were conducted in 
accordance with institutional guidelines and laws, following protocols approved by 
local animal ethics committees and authorities (Regierungspraesidium Darmstadt). 
Immunohistochemistry of mouse tissues. To analyse blood vessel growth in the 
postnatal retina, whole mouse eyes were fixed in 4% paraformaldehyde (PFA) on 
ice for 1h. Eyes were washed in PBS before the retinas were dissected and partially 
cut into four leaflets. After blocking/permeabilization in 2% goat serum (Vector 
Laboratories), 1% BSA and 0.5% Triton X-100 (in PBS) for 1h at room tempera- 
ture, the retinas were incubated at 4°C overnight in incubation buffer containing 
1% goat serum, 0.5% BSA and 0.25% Triton X-100 (in PBS) and the primary anti- 
body. Primary antibodies against the following proteins were used: cleaved caspase 
3 (Cell Signaling Technology, #9664, 1:100), collagen IV (AbD Serotec, #2150-1470, 
1:400), ERG 1/2/3 (Abcam, #ab92513, 1:200), FOXO1 (Cell Signaling Technology, 
#2880, 1:100), GFP (Invitrogen, #A21311, 1:100), ICAM2 (BD Biosciences, 
#553326, 1:200), MYC (Millipore, 06-340, 1:100), PECAM-1 (R&D Systems, 
AF3628, 1:400), phospho-histone H3 (Chemicon, #06-570, 1:100), TER119 (BD 
Biosciences, #553670, 1:100), and VE-cadherin (BD Biosciences, #555289, 1:25). 
After four washes with 0.1% Triton X-100 in PBS (PBST), retinas were incubated 
with Alexa-Fluor 488-, Alexa-Fluor 555- or Alexa-Fluor 647-conjugated secondary 
antibodies (Invitrogen, 1:400) for 2h at room temperature. For staining ECs with 
isolectin B4 (IB4), retinas were washed with PBLEC buffer (1 mM CaCl2, 1 mM
MgCl2, 1 mM MnCl2 and 1% Triton X-100 in PBS) and incubated with biotinylated
IB4 (Griffonia simplicifolia, #B1205, Vector Laboratories, 1:100) diluted in PBLEC 
buffer. After washing, retinas were incubated in Alexa-Fluor-coupled streptavidin 
(Invitrogen, #821374, 1:200) for 2h at room temperature. For nuclear counterstain, 
retinas were incubated with 4’,6-diamidino-2-phenylindole (DAPI; Sigma Aldrich, 
#D9542, 1:1,000) for 15 min following washes with PBST and PBS. The labelling 
of proliferating cells with BrdU was performed in P5 pups. In brief, 50 mg kg⁻¹ of
BrdU (Invitrogen, #B23151) per pup was injected i.p. 3h before they were killed. 
Retinas were fixed for 2h in 4% PFA and then incubated for 1 h in 65°C warm formamide,
followed by an incubation of 30 min in 2 N HCl. Afterwards retinas were
washed twice with 0.1 M Tris-HCl (pH 8) and then blocked in 1% BSA, 0.5% Tween 
20 in PBS and incubated overnight at 4°C with a mouse anti-BrdU antibody (BD 
Biosciences, #347580, 1:50). The detection was performed with Alexa-Fluor-488 
anti-mouse secondary antibody (Invitrogen, A21202, 1:400). After the BrdU stain- 
ing, retinas were processed for the IB4 staining as described earlier. The dissection 
of the embryonic hindbrain was performed as described™. After overnight fixation 
in 4% PFA, dissected hindbrains were incubated in a blocking solution containing 
10% serum, 1% BSA and 0.5% Triton X-100 in PBS at 4°C. After washes with PBS, 
hindbrains were incubated for 1 h in PBLEC buffer before the overnight incubation 
with Alexa-Fluor-conjugated IB4 (Invitrogen, #121411, 1:100 in PBLEC) at 4°C. 
Hindbrains were washed with PBS and stained with DAPI. Retinas and embryonic 
hindbrains were flat-mounted with Vectashield (Vector Laboratories) and exam- 
ined by confocal laser microscopy (Leica TCS SP5 or SP8). Immunostainings were 
carried out in tissues from littermates and processed under the same conditions. 

Immunohistochemistry of cell cultures. HUVECs were seeded on glass-
bottom culture dishes (Mattek) and cultured at 37 °C and 5% CO2. To detect autophagy,
cells were washed and fixed with 4% PFA for 20 min at room temperature.
Permeabilization was performed in 1% BSA, 10% donkey serum and 0.5% Tween-
20 in PBS. Cells were stained for anti-LC3A/B (Cell Signaling Technology, #12741, 
1:400), Phalloidin-TRITC (Sigma Aldrich, #P1951, 1:500) and DAPI in incubation 
buffer (0.5% BSA, 5% donkey serum and 0.25% Tween-20 in PBS). After washes 
with PBST, samples were incubated with Alexa-Fluor-conjugated secondary anti- 
bodies (Invitrogen, 1:200). Cells were washed and mounted in VectaShield. As a 
positive control, HUVECs were treated with 50 µM chloroquine overnight before
fixation. 

Image acquisition and processing. Stained tissue/cells were analysed at high res- 
olution with a TCS SP8 confocal microscope (Leica). Volocity (Perkin Elmer), Fiji/ 
ImageJ, Photoshop (Adobe) and Adobe Illustrator (Adobe) software were used 
for image acquisition and processing. For all of the images in which the levels of 
immunostaining were compared, settings for laser excitation and confocal scanner 
detection were kept constant between groups. 

Quantitative analysis of the retinal and hindbrain vasculature. All quantifica- 
tions were done on high-resolution confocal images of thin z-sections of the sample 
using the Volocity (Perkin Elmer) software. In the retina, endothelial coverage, the 
number of endothelial branchpoints, and the average vessel branch diameter were 
quantified behind the angiogenic front in a region between an artery and a vein. 
In the embryonic hindbrain, randomly chosen fields were used to quantify the 
vascularization in the ventricular zone. All parameters were quantified in a mini- 
mum of four vascularized fields per sample. Endothelial coverage was determined 
by assessing the ratio of the IB4-positive area to the total area of the vascularized 
field (sized 200 µm × 200 µm), and expressed as a percentage of the area covered by
IB4-positive ECs. Average vessel diameter was analysed by assessing the diameter
of individual vessel branches in a vascularized field (sized 200 µm × 200 µm), which
was used to calculate the mean diameter in each field. The diameter of individual
vessel branches was averaged from three measurements taken at the proximal,
middle and distal part of the vessel segment. The number of filopodial extensions was
quantified at the angiogenic front. The total number of filopodia was normalized
to a vessel length of 100 µm at the angiogenic front, which was defined and measured
according to published protocols*’. For quantifying vascular outgrowth in the
mouse retina, the distance of vessel growth from the centre of the optic nerve to 
the periphery was measured in each leaflet of a dissected retina, which was used to 
calculate the mean value for each sample. The number of ERG/IB4- and BrdU/IB4- 
labelled cells was counted in at least four fields sized 200 µm × 200 µm per sample.
Because of the lower incidence of pHH3-positive ECs, the number of pHH3/IB4-
double-positive cells was quantified in larger fields (sized 580 µm × 580 µm). For
the quantification of the mosaic control (Pdgfb-creERT2;Rosa26-Foxo1*! +Rosa26- 
mTImG"*) and Foxol®A (Pdgfb-creERT2;Rosa26-Foxo1°4!+;Rosa26-mTmG"*) 
retinas, the GFP/IB4 double-positive area per field was determined and divided 
by the total IB4-positive area. The percentage of the GFP/IB4 double-positive area 
per total IB4 area was measured in four fields (400 µm × 400 µm) per sample and
used to calculate the mean value. For the quantification of nuclear FOXO1 expres- 
sion in control and Foxol'®~ mice, high-resolution confocal images were taken 
with a x40 objective. The resulting images were analysed with the Bitplane Imaris 
software. Vessels were first segmented using the Surface module in Imaris. FOXO1 
immunofluorescence was then used to set a threshold in the new vascular surface 
area, in which only CD31-positive nuclei were selected (Surface module). The 
sum intensity of the nuclear FOXO1 fluorescence was divided by the total vascular 
area to adjust for differences in vascular density on each image. An average of six 
images per sample was quantified in three animals per group. All of the images 
shown are representative of the vascular phenotype observed in samples from at 
least two distinct litters per group. 
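
The per-field measurements described above reduce to a few ratios and averages. The following is a minimal sketch of that arithmetic, not the authors' Volocity workflow; the function names, the binary-mask representation of a field and the example numbers are all hypothetical.

```python
from statistics import mean

def endothelial_coverage(mask):
    """Percentage of a field covered by IB4-positive pixels.

    `mask` is a hypothetical binary image: rows of 0/1 values,
    where 1 marks an IB4-positive pixel.
    """
    total = sum(len(row) for row in mask)
    positive = sum(sum(row) for row in mask)
    return 100.0 * positive / total

def mean_vessel_diameter(branches):
    """Mean branch diameter of one field.

    Each branch is a (proximal, middle, distal) tuple of diameters;
    the three values are averaged per branch, then across branches.
    """
    return mean(mean(branch) for branch in branches)

def filopodia_per_100um(n_filopodia, front_length_um):
    """Filopodia count normalized to 100 µm of angiogenic front."""
    return 100.0 * n_filopodia / front_length_um

def gfp_fraction(gfp_ib4_area, ib4_area):
    """GFP/IB4 double-positive area as a percentage of total IB4 area."""
    return 100.0 * gfp_ib4_area / ib4_area

def sample_value(field_values, min_fields=4):
    """Average the per-field values of one sample (at least four fields)."""
    if len(field_values) < min_fields:
        raise ValueError("too few fields quantified for this sample")
    return mean(field_values)
```

Each sample then contributes a single value (the mean of its fields) to the group statistics.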

Cell culture. Pooled HUVECs were purchased from Lonza and authenticated by 
marker expression (CD31/CD105 double-positive) and morphology. HUVECs 
were cultured in endothelial basal medium (EBM; Lonza) supplemented with 
hydrocortisone (1 µg ml⁻¹), bovine brain extract (12 µg ml⁻¹), gentamicin
(50 µg ml⁻¹), amphotericin B (50 ng ml⁻¹), epidermal growth factor (10 ng ml⁻¹)
and 10% fetal bovine serum (FBS; Life Technologies). HUVECs were tested 
negative for mycoplasma and cultured until the fourth passage. The isolation of 
mouse lung ECs was performed as described*®. In brief, adult mice were killed, 
lungs were removed and incubated with dispase. The homogenate was filtered 
through a cell strainer, collected by centrifugation, and washed with PBS con- 
taining 0.1% BSA (PBSB). The resulting cell suspension was incubated with rat 
anti-mouse VE-cadherin antibody- (BD Pharmingen, #555289) coated magnetic 
beads (Dynabeads, Invitrogen, #11035). Next, the beads were washed with PBSB 
and then resuspended in DMEM/F12 (Invitrogen) supplemented with 20% FCS, 
endothelial growth factor (Promocell, #C-30140), penicillin and streptomycin. The 
isolated cells were seeded on gelatin-coated culture dishes and re-purified with the 
VE-cadherin antibody during the first three passages. 

Adenoviral infection. Sub-confluent HUVECs were infected with adenoviruses 
to overexpress constitutively active human FOXO1-Flag (FOXO1CA)*”, human
c-MYC-HA*® (Vector Biolabs) and GFP or LacZ as a control. HUVECs (70-80%
confluent) were incubated in EBM containing 0.1% BSA for 4h. Prior to infection, 
adenoviruses were incubated with an antennapedia-derived peptide (Eurogentec) 
to facilitate the infection. The mixture was then applied to the HUVECs cultured in 
EBM containing 0.1% BSA and incubated for 4h. Thereafter, the cells were washed 
five times and cultured in EBM with 10% FCS and supplements. The adenoviral 
infection of murine ECs was performed with adenoviruses encoding for Cre or 
GFP (Vector Biolabs) as a control. 

RNA interference. To silence FOXO1, MYC or MXI1 gene expression, HUVECs 
were transfected with a pool of siRNA duplexes directed against human FOXO1,
human c-MYC or human MXI1 (ON-TARGETplus SMARTpool, Dharmacon). A
negative control pool of four siRNAs designed and microarray-tested for minimal 
targeting of human, mouse or rat genes was used as a control (ON-TARGETplus 
Non-targeting pool, Dharmacon). HUVECs were transfected with 50nM of the 
indicated siRNAs using Lipofectamine RNAiMAX (Invitrogen) according to the 
manufacturer's recommendations. 

Microarray and gene set enrichment analysis. Total RNA quality was verified 
using the Agilent Bioanalyser and the 6000 nano kit. RNA was labelled according 
to the Affymetrix Whole Transcript Sense Target Labelling protocol. Affymetrix 
GeneChip Human Gene 1.0 ST arrays were hybridized, processed and scanned 
using the appropriate Affymetrix protocols. Data were analysed with the
Affymetrix Expression Console using the RMA algorithm; statistical analysis was
done using DNAStar Arraystar 11. Heat maps were generated using GENE-E,
publicly available from the Broad Institute (http://www.broadinstitute.org/cancer/ 
software/GENE-E/). For gene set enrichment analysis (GSEA), gene set collections 
from the Molecular Signatures Database (MSigDB) 4.0 (http://www.broadinstitute. 
org/gsea/msigdb/) were used for the analysis of the endothelial FOXO1 and MYC 
transcriptomes. 

qRT-PCR. RNA was extracted from cells using the RNeasy Mini Kit (Qiagen) 
according to the manufacturer’s instructions. cDNA synthesis was performed 
on 2 µg of total RNA using the M-MLV reverse transcriptase (Invitrogen). qPCR
was performed with TaqMan Gene Expression Master Mix (Applied Biosystems) 
and TaqMan probes (TaqMan Gene Expression Assays) available from Applied 
Biosystems. TaqMan Gene Expression Assays used were as follows: human 
ACTB Hs99999903_m1; CCNB2 Hs00270424_m1; CCND1 Hs00765553_m1; 
CCND2 Hs00153380_m1; CDK4 Hs00262861_m1; c-MYC Hs00153408_m1; 
ENO1 Hs00361415_m1; FASN Hs01005622_m1; FBXW7 Hs00217794_m1; 
FOXO1 Hs01054576_m1; LDHA Hs00855332_g1; LDHB Hs00929956_m1; 
MXI1 Hs00365651_m1; PKM2 Hs00987254_m1. Mouse probes were: Actb
Mm00607939_s1; Myc Mm00487804_m1. All qPCR reactions were run on
a StepOnePlus real-time PCR instrument (Applied Biosystems) and data were
calculated using the ΔΔCt method.
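
For readers unfamiliar with the ΔΔCt calculation, a minimal sketch follows. It assumes the standard form of the method with roughly 100% amplification efficiency (a doubling per cycle) and a single reference gene such as ACTB; the function name and the Ct values in the usage note are hypothetical and are not data from this study.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the ΔΔCt method.

    Assumes ~100% PCR efficiency, so one cycle equals a twofold
    difference in template amount.
    """
    # ΔCt: target normalized to the reference gene within each condition
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # ΔΔCt: treated sample relative to the control condition
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)
```

With hypothetical Ct values of 26.0 (target) and 18.0 (reference) in the treated sample and 24.0 and 18.0 in the control, `relative_expression(26.0, 18.0, 24.0, 18.0)` gives 0.25, that is, a fourfold reduction.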

Western blot analysis and antibodies. Western blot analyses were performed 
with precast gradient gels (Bio-Rad) using standard methods. Briefly, HUVECs 
were lysed in RIPA buffer (150 mM NaCl, 1.0% IGEPAL CA-630, 0.5% sodium 
deoxycholate, 0.1% SDS and 50 mM Tris, pH 8.0) supplemented with a protease 
inhibitor mix (Complete Mini Protease Inhibitor cocktail tablets, Roche) and 
phenylmethylsulfonyl fluoride. Proteins were separated by SDS-PAGE and blotted 
onto nitrocellulose membranes (Bio-Rad). Membranes were probed with specific 
primary antibodies and then with peroxidase-conjugated secondary antibodies. 
The following antibodies were used: AMPKα (Cell Signaling Technology, #2532,
1:1,000), caspase 3 (Cell Signaling Technology, #9662, 1:1,000), cleaved caspase 
3 (Asp175) (Cell Signaling Technology, #9664, 1:1,000), cleaved PARP (Cell 
Signaling Technology, #5625, 1:1,000), c-MYC (Cell Signaling Technology, #9402, 
1:1,000), FBXW7 (Abcam, #12292, 1:500), Flag M2 (Sigma, #F-3165, 1:1,000), 
FOXO1 (Cell Signaling Technology, #2880, 1:1,000), HA (Covance, clone 16B12, 
MMS-101P, 1:1,000), LC3A/B (Cell Signaling Technology, #12741, 1:1,000), MXI1 
(Santa Cruz, SC-1042, 1:500), P-ACC (Cell Signaling Technology, #3661, 1:1,000), 
P-AMPKα (Thr 172) (Cell Signaling Technology, #2535, 1:1,000), PARP (Cell
Signaling Technology, #9532, 1:1,000), Tubulin (Cell Signaling Technology, #2148, 
1:1,000). The bands were visualized by chemiluminescence using an ECL detec- 
tion kit (Clarity Western ECL Substrate, Bio-Rad) and a ChemiDoc MP Imaging 
System (Bio-Rad). The gel source data of the western blot analysis is illustrated 
in Supplementary Fig. 1. Quantification of band intensities by densitometry was 
carried out using the Image Lab software (Bio-Rad). 

Metabolic assays. Extracellular acidification (ECAR) and oxygen consump- 
tion (OCR) rates were measured using the Seahorse XFe96 analyser (Seahorse 
Bioscience) following the manufacturer's protocols. Briefly, ECAR and OCR were 
measured 4h after seeding HUVECs (40,000 cells per well) on fibronectin-coated 
XFe96 microplates. HUVECs were maintained in non-buffered assay medium in a
non-CO2 incubator for 1h before the assay. The Glycolysis stress test kit (Seahorse
Bioscience) was used to monitor the extracellular acidification rate under various
conditions. Three baseline recordings were made, followed by sequential injection
of glucose (10 mM), the mitochondrial/ATP synthase inhibitor oligomycin (3 µM),
and the glycolysis inhibitor 2-deoxy-D-glucose (2-DG; 100 mM). The Mito stress
test kit was used to assay the mitochondrial respiration rate under basal conditions,
in the presence of the ATP synthase inhibitor oligomycin (3 µM), the mitochondrial
uncoupler carbonyl cyanide-4-(trifluoromethoxy)phenyl-hydrazone (FCCP;
1 µM), and the respiratory chain inhibitors antimycin A (1.5 µM) and rotenone
(3 µM). To measure glycolysis in ECs, HUVECs were incubated for 2h in growth
medium containing 80 µCi mmol⁻¹ [5-³H]-D-glucose (Perkin Elmer). Thereafter,
supernatant was transferred into glass vials sealed with rubber stoppers. ³H2O was
captured in hanging wells containing a Whatman paper soaked with H2O over
a period of 48h at 37°C to reach saturation*. Radioactivity was determined by
liquid scintillation counting and normalized to protein content. Lactate concen- 
tration in the HUVEC culture media was measured by using a Lactate Assay Kit 
(Biovision) following the instructions of the manufacturer. Glucose uptake was 
assessed by analysing the uptake of 2-DG with a Colorimetric Assay (BioVision). 
ATP was measured from lysates from HUVECs (1 x 10° per ml) with an ATP 
Bioluminescence Assay Kit CLS II (Roche) according to the instructions of the 
manufacturer. 
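
The sequential-injection design of the Mito stress test lends itself to a simple derivation of respiration parameters from the OCR trace. The sketch below uses the conventional definitions (basal, ATP-linked, maximal and spare capacity) and is an assumption on our part: the text does not state how the parameters were computed, and the function name, argument layout and example readings are hypothetical.

```python
from statistics import mean

def mito_stress_parameters(basal, oligo, fccp, aa_rot):
    """Derive respiration parameters from OCR readings of each phase.

    Each argument is a list of OCR readings recorded during one phase:
    baseline, after oligomycin, after FCCP, after antimycin A/rotenone.
    """
    # OCR remaining after respiratory-chain inhibition is non-mitochondrial
    non_mito = mean(aa_rot)
    basal_resp = mean(basal) - non_mito
    # The oligomycin-sensitive share of basal OCR is coupled to ATP synthesis
    atp_linked = mean(basal) - mean(oligo)
    # Uncoupling with FCCP reveals maximal respiratory capacity
    maximal = max(fccp) - non_mito
    return {
        "non_mitochondrial": non_mito,
        "basal": basal_resp,
        "atp_linked": atp_linked,
        "maximal": maximal,
        "spare_capacity": maximal - basal_resp,
    }
```

For example, hypothetical readings of 100 (baseline), 40 (oligomycin), up to 200 (FCCP) and 20 (antimycin A/rotenone) yield a basal respiration of 80, ATP-linked respiration of 60, maximal respiration of 180 and a spare capacity of 100.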

Intracellular ROS measurement. Intracellular ROS levels were determined 
using CM-H2DCFDA dye (Life Technologies). Dye was reconstituted in DMSO
(10 mM) and diluted 1:1,000 in PBS containing CaCl2 and MgCl2 as working
solution. Twenty-four hours after transduction, 1 x 10° cells were incubated in 1 ml 
working solution for 40 min at 37°C in the dark. Subsequently the fluorescence 
of 10,000 living endothelial cells per sample was measured at the BD FACS LSR 
II flow cytometer. The assays were performed with adenoviruses, which did not 
co-express fluorescent reporter genes. Data were analysed using BD FACSDiva 
software (version 8.0.1). 

Senescence-associated β-galactosidase staining. To detect senescence-associated
β-galactosidase activity in HUVECs, a cellular senescence assay kit (#KAA002,
Chemicon) was used according to the manufacturer’s instructions. Briefly, cells 
were fixed in 1 ml fixing solution at room temperature for 15 min. Two millilitres 
of freshly prepared SA-β-gal detection solution was added and cells were incubated
overnight at 37°C without CO2 and protected from light. Then the detection
solution was removed and cells were washed and mounted in 70% glycerol in PBS.
H2O2-treated HUVECs were used as a positive control.

Statistical analysis. Statistical analysis was performed by unpaired, two-tailed 
Student's t-test, or non-parametric one-way ANOVA followed by Bonferroni's 
multiple comparison test unless mentioned otherwise. For all bar graphs, data are 
represented as mean ± s.d. P values < 0.05 were considered significant. All calculations
lations were performed using GraphPad Prism software. No randomization or 
blinding was used and no animals were excluded from the analysis. Sample sizes 
were selected on the basis of published protocols** and previous experiments. 
Several independent experiments were performed to guarantee reproducibility 
and robustness of findings. 
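
As an illustration of the tests named above, the sketch below computes the mean ± s.d. summary, the statistic of an unpaired Student's t-test with pooled variance, and a Bonferroni adjustment. It is not the GraphPad Prism workflow used in the study; the function names and numbers are hypothetical, and only the Python standard library is used, so the two-tailed P value itself would still have to be read from a t-distribution (for instance with `scipy.stats.ttest_ind`, which performs the same equal-variance test by default).

```python
from math import sqrt
from statistics import mean, stdev

def mean_sd(values):
    """Return (mean, sample standard deviation) for one group."""
    return mean(values), stdev(values)

def unpaired_t(a, b):
    """Student's t statistic and degrees of freedom for two independent groups.

    Uses the pooled-variance (equal variance) form of the test.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2

def bonferroni(p_values):
    """Multiply each P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

For two hypothetical groups `[1, 2, 3]` and `[2, 3, 4]`, `unpaired_t` returns a t statistic of about -1.22 with 4 degrees of freedom.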
Extended Data Figure 1 | Constitutive and inducible deletion of Foxo1
in ECs of mice. a, Strategy to generate a conditional Foxo1 mutant allele in
which exons 2 and 3 are flanked by lox sites. The structures of the genomic
locus, the targeting vector, and the targeted allele are shown. FRT-Neo-FRT,
neomycin resistance cassette flanked by FRT sites. TK1, thymidine
kinase. b, Table of viable offspring from Tie2-cre;Foxo1fl/+ (male) and
Foxo1fl/fl (female) intercrosses. c, Control (Foxo1fl/fl) and Foxo1EC-KO
mutants (Tie2-cre;Foxo1fl/fl) at E10.5. d, PCR of genomic DNA from P5
control (Foxo1fl/fl, lanes 1 and 3) and Foxo1iEC-KO (Pdgfb-creERT2;Foxo1fl/fl,
lanes 2 and 4) pups untreated (lanes 1 and 2) or treated (lanes 3 and 4)
with 4-OHT. Recombination of the floxed Foxo1 allele (Δ) occurs
only in 4-OHT-injected animals that are Pdgfb-creERT2-positive.
e, Immunofluorescence staining for FOXO1, VE-cadherin (VECAD) and
isolectin-B4 (IB4) in a P5 mouse retina of 4-OHT-injected control and
Foxo1iEC-KO mice. f, Confocal images of mTmG control and Foxo1iEC-KO
mice that were injected with 4-OHT from P1 to P4 and analysed for GFP,
ERG and IB4 expression.


© 2016 Macmillan Publishers Limited. All rights reserved 



Extended Data Figure 2 | Endothelial FOXO1 deficiency leads to 
abnormal vessel size and shape. a, Immunostaining for VECAD and 
ERG in Foxo1iEC-KO and control retinas. The bottom panels show the 
isolated VECAD and ERG signals of the inset. b, Confocal images showing 
maximum intensity projections and X-Y, X-Z, and Y-Z planes of a 
thick stack of IB4 and collagen IV (COL) stained P5 retinas. Foxo1iEC-KO 
mice develop enlarged vessels with abnormal lumen organization. White 
arrowheads point to areas with multiple vessel layers and intraluminal 
collagen strands. c, Images of IB4- (cyan) and TER119- (red) stained P5 
retinas of control and Foxo1iEC-KO mice. Note that aggregates of TER119+ 
red blood cells form in Foxo1iEC-KO but not in control mice. d, Phospho-histone 
H3 (pHH3) and IB4 immunostaining of P5 Foxo1iEC-KO and 
control mice. e, Images of IB4-stained retinas at P21 showing an increased 
vessel density in Foxo1iEC-KO mice (same samples as in Fig. 1h). f, Higher 
magnification images of ERG-, ICAM2- and IB4-stained retinas at P21 
showing increased numbers of ECs in the perivenous plexus of Foxo1iEC-KO 
mice. g, Bar graphs showing the mean endothelial area (n > 8), mean 
diameter of central vein (n > 8), and number of ERG/IB4+ cells (n > 4) in 
P21 retinas of Foxo1iEC-KO and control mice. Data represent mean ± s.d. 
Two-tailed unpaired t-test. ****P < 0.0001. 
Extended Data Figure 3 | Inducible overexpression of a constitutively 
active FOXO1 mutant in ECs of mice. a, A cassette containing the CAG 
promoter, a floxed STOP sequence, a cDNA encoding for Foxo1, and 
IRES-GFP was inserted into the Rosa26 locus. A schematic representation 
of the wild-type Rosa26 locus, the floxed allele, and the recombined allele 
after cre expression is shown. b, Immunofluorescence staining for FOXO1, 
GFP and PECAM in P5 Foxo1iEC-CA and control mice. c, Confocal images 
of mTmG* control and Foxo1iEC-CA mice that were injected with 4-OHT 
from P1 to P4 and analysed for GFP, ERG and IB4 expression. The right 
half of both images shows the GFP signal alone. d, High-magnification 
images of IB4-stained retinal vessels at the angiogenic front in control 
and Foxo1iEC-CA pups. e, BrdU and IB4 labelling of whole-mount P5 
retinas reveals reduced endothelial proliferation in Foxo1iEC-CA animals. 
f, Confocal images showing MYC and PECAM immunostaining in P5 
retinas of control and Foxo1iEC-CA mice. The lower half of both images 
shows the MYC signal alone. g, Quantification of FOXO1 nuclear 
staining intensity in ECs (n = 3), radial migration (n = 10), endothelial 
coverage (n = 10), branch points (n = 10), and endothelial BrdU 
incorporation (n > 6) in P5 retinas of control and Foxo1iEC-CA mutant 
mice. Data represent mean ± s.d. Two-tailed unpaired t-test. **P < 0.01; 
***P < 0.001; ****P < 0.0001. 
Extended Data Figure 5 | Forced expression of FOXO1 does not induce 
apoptosis, senescence, autophagy or energy distress in cultured ECs. 
a, Immunoblot analysis and quantification of FOXO1 protein levels in 
AdCTL and AdFOXO1CA-Flag transduced HUVECs (n = 20). b, ATP 
levels in ECs 24 h after transduction with AdCTL or AdFOXO1CA (n = 7). 
c, Western blot images and quantification of AdCTL- or AdFOXO1CA-Flag-transduced 
HUVECs showing that FOXO1 does not alter the 
phosphorylation of AMPKα (Thr 172) or of its substrate ACC (Ser 79). 
Oligomycin (Oligo), positive control. TUB, tubulin. n = 10. d, Western 
blotting of AdCTL- or AdFOXO1CA-Flag-transduced HUVECs 
illustrating that overexpression of FOXO1 does not induce apoptotic 
cell death. Cleaved caspase 3 (CASP3) and PARP served as markers of 
apoptosis. Cycloheximide (CHX) and TNF-α (TNF) costimulation, 
positive control. e, Analysis of senescence-associated genes by microarray 
demonstrating that senescence markers were not significantly changed 
or even downregulated in FOXO1CA-overexpressing ECs. n = 3. f, Images 
of β-galactosidase stainings in AdCTL- and AdFOXO1CA-transduced 
HUVECs showing no increase in senescence-associated β-galactosidase 
activity (SABG). g, Densitometric quantification of the LC3-II to LC3-I 
ratio in AdCTL- or AdFOXO1-transduced HUVECs (n = 10). 
h, Immunofluorescence analysis of AdCTL- and AdFOXO1CA-transduced 
HUVECs (both coexpressing GFP) using LC3 and GFP antibodies. 
Chloroquine (CQ), positive control. DAPI, endothelial nuclei. Data in 
a-c, e and g represent mean ± s.d. Two-tailed unpaired t-test. *P < 0.05; 
***P < 0.0001; NS, not significant. 
Extended Data Figure 6 | FOXO1 represses MYC signalling in ECs. 
a, Microarray expression analysis of FOXO1 and of canonical FOXO 
target genes in AdFOXO1- and AdCTL-expressing HUVECs 16 h 
after transduction (n = 3). b, GSEA of the FOXO1 DNA-binding element 
(TTGTTTAC) gene set in AdFOXO1CA- or AdCTL-transduced ECs. 
ES, enrichment score; NES, normalized enrichment score. c, GSEA of MYC 
gene signatures showing the downregulation of MYC target genes in 
FOXO1CA-expressing HUVECs. d, qPCR expression analysis of MYC at 3, 6 
and 16 h in AdCTL and AdFOXO1CA-transduced HUVECs (n > 4). e, qPCR 
analysis (n = 4) of Myc mRNA levels in ECs isolated from Foxo1CA mice 24 h 
after transduction with a control or Cre (AdCre) adenovirus. f, Immunoblot 
analysis of MYC in ECs isolated from Foxo1 mice following transduction 
with AdCTL or AdCre (n = 3). Cre-mediated recombination gave rise to a 
2.8 ± 0.3-fold increase in FOXO1 protein expression. g, Expression analysis 
of MYC in HUVECs by western blotting after RNA interference (RNAi)-mediated 
knockdown of FOXO1 (siFOXO1). siSCR, scrambled control 
(n = 3). h, MYC protein expression in ECs isolated from Foxo1fl/fl mice 24 h 
after transduction with an AdCTL or AdCre-encoding adenovirus (n = 3). 
a, d-h, Data represent mean ± s.d., two-tailed unpaired t-test. *P < 0.05; 
**P < 0.01; ***P < 0.001; ****P < 0.0001. 
Extended Data Figure 7 | FOXO1 interferes with MYC signalling at 
different levels. a, Western blot analysis of MYC, MXI1 and FBXW7 in 
AdCTL- or AdFOXO1CA-Flag-transduced HUVECs. b, Immunoblot 
analysis and quantification of MYC protein levels in AdCTL and 
AdFOXO1CA-Flag transduced HUVECs that were co-treated with the 
proteasomal inhibitor MG132 (n = 3). c, Analysis of MYC protein half-life 
in AdCTL- or AdFOXO1CA-Flag-transduced HUVECs. The day after 
transduction, HUVECs were treated with cycloheximide (CHX) and 
incubated for the times indicated. Data represent mean ± s.d. Two-way 
ANOVA with Bonferroni’s multiple comparison post-hoc test. d, e, qPCR 
(d) and immunoblot analysis (e) of MYC levels in control (siSCR) or 
MXI1 (siMXI1) siRNA-transfected HUVECs that were also transduced 
with AdCTL or AdFOXO1CA-Flag (n > 5). f, qPCR analysis of MYC target 
genes in siSCR or siMXI1-transfected HUVECs that were cotransduced 
with AdCTL or AdFOXO1CA (n > 3). Data represent mean ± s.d. 
One-way ANOVA with Bonferroni’s multiple comparison post-hoc test 
was performed in b, d, e and f. *P < 0.05; **P < 0.01; ***P < 0.001; 
****P < 0.0001. 
Extended Data Figure 8 | MYC regulates genes involved in cell 
metabolism and growth in ECs. a, b, Analysis of MYC expression by 
qPCR (a) and immunoblot (b) in scrambled (siSCR) and MYC (siMYC) 
siRNA-treated HUVECs 24 h after transfection (n = 7). c, GSEA of the 
MYC (CACGTG) DNA-binding element gene set in siSCR- or siMYC-transfected 
HUVECs. d, GSEA of MYC gene signatures showing the 
downregulation of MYC target genes in MYC-depleted HUVECs. 
e, Heat map of downregulated MYC signature genes in MYC-silenced 
HUVECs (n = 3). Genes highlighted in red indicate genes that are also 
suppressed by FOXO1 overexpression. f, Table of KEGG gene sets 
enriched among genes downregulated in the MYC siRNA-transfected 
ECs. g, Expression analysis of FOXO1-regulated MYC target genes by 
qPCR in MYC-silenced HUVECs (n > 4). a, g, Data represent mean ± s.d., 
two-tailed unpaired t-test. *P < 0.05; **P < 0.01; ***P < 0.001; 
****P < 0.0001. 
Extended Data Figure 9 | MYC is a critical driver of endothelial 
proliferation, growth and metabolism. a, b, IB4 and pHH3 (a) or 
BrdU (b) labelling of P5 retinas reveals reduced endothelial proliferation 
in MyciEC-KO mice. c, ICAM2, IB4 and COL staining of retinas at P5 
showing an increased number of empty (COL+, IB4−) sleeves (white 
arrows) in the plexus of MyciEC-KO mutants. d, Quantitative analysis of 
the indicated vascular parameters in P5 retinas of control and MyciEC-KO 
mice (n > 8). e, ECAR (n = 4) and OCR (n = 4) in AdMYC-transduced 
HUVECs showing a heightened metabolic activity in MYC-overexpressing 
ECs (6.8 ± 1.4-fold MYC overexpression). f, Pdgfb-creERT2-mediated 
overexpression of MYC (2.4 ± 0.8-fold MYC overexpression) enhances 
vascular growth as indicated by the parameters assessed at P5 (n > 6). 
g, ERG and IB4 labelling of P5 retinas showing an increase in cellularity 
in vessels of MyciEC-OE mice. h, Enhanced EC proliferation in MyciEC-OE 
mice as revealed by BrdU and IB4 costaining. i, j, Overview (i) and higher 
magnification images (j) of ICAM2-, IB4- and COL-stained retinas at P21 
showing aberrant vascular growth and venous enlargement in MyciEC-OE 
mice. k, Increased endothelial cellularity in veins of MyciEC-OE mice at P21. 
d-f, Data represent mean ± s.d., two-tailed unpaired t-test. *P < 0.05; 
**P < 0.01; ***P < 0.001; ****P < 0.0001. 
Extended Data Figure 10 | Restoration of MYC signalling in FOXO1CA-overexpressing 
endothelium normalizes vascular growth. a, b, Confocal 
images (a) and quantification (b) of ERG- and IB4-stained P5 retinas in 
control, Foxo1iEC-CA, MyciEC-OE and Foxo1iEC-CA/MyciEC-OE mice (same 
samples as in Fig. 4h) showing that EC numbers are normalized in the 
Foxo1iEC-CA/MyciEC-OE double mutants (n > 3). c, Relative ROS levels in 
AdCTL-, AdFOXO1CA-, AdFOXO1CA/AdMYC- and AdMYC-transduced 
HUVECs showing that ROS levels increase again in FOXO1CA/MYC 
co-expressing ECs (n > 6). b, c, Data represent mean ± s.d., one-way 
ANOVA with Bonferroni’s multiple comparison post-hoc test. *P < 0.05; 
**P < 0.01; ***P < 0.001; ****P < 0.0001. 
Tuft-cell-derived IL-25 regulates an intestinal 
ILC2-epithelial response circuit 


Jakob von Moltke!, Ming Ji!*, Hong-Erh Liang! & Richard M. Locksley!*? 


Parasitic helminths and allergens induce a type 2 immune response 
leading to profound changes in tissue physiology, including 
hyperplasia of mucus-secreting goblet cells1 and smooth muscle 
hypercontractility2. This response, known as ‘weep and sweep’, 
requires interleukin (IL)-13 production by tissue-resident group 
2 innate lymphoid cells (ILC2s) and recruited type 2 helper T cells 
(TH2 cells)3. Experiments in mice and humans have demonstrated 
requirements for the epithelial cytokines IL-33, thymic stromal 
lymphopoietin (TSLP) and IL-25 in the activation of ILC2s*"'" but 
the sources and regulation of these signals remain poorly defined. 
In the small intestine, the epithelium consists of at least five 
distinct cellular lineages’, including the tuft cell, whose function 
is unclear. Here we show that tuft cells constitutively express IL-25 
to sustain ILC2 homeostasis in the resting lamina propria in mice. 
Figure 1 | Intestinal tuft cells constitutively express Il25. a, Indicated 
tissues from Il25F25/F25 mice stained for RFP (red), EPCAM (green) and 
4′,6-diamidino-2-phenylindole (DAPI; blue). b, Flow cytometry of digested 
jejunum. c, d, Jejunum from Il25F25/F25 mice stained as indicated. Dotted 
lines outline villi. Arrowheads indicate RFP+ cells. e, f, Quantitative 
polymerase chain reaction with reverse transcription (RT-PCR) on cells 


After helminth infection, tuft-cell-derived IL-25 further activates 
ILC2s to secrete IL-13, which acts on epithelial crypt progenitors to 
promote differentiation of tuft and goblet cells, leading to increased 
frequencies of both. Tuft cells, ILC2s and epithelial progenitors 
therefore comprise a response circuit that mediates epithelial 
remodelling associated with type 2 immunity in the small intestine, 
and perhaps at other mucosal barriers populated by these cells. 
To study the source and regulation of Il25 in vivo, we generated 
a knock-in mouse termed Flare25 (flox and reporter of Il25; 
Il25F25/F25) that expresses tandem-dimer red fluorescent protein (RFP) 
from the Il25 locus and enables conditional deletion of IL-25 activity 
(Extended Data Fig. 1a). Immunohistochemistry and flow cytometry 
revealed RFP only in rare epithelial (epithelial cell adhesion molecule 
(EPCAM)+) cells throughout the digestive tract (Fig. 1a, b and 
sorted from small intestines of Il25−/− (e) and Il25F25/F25 (e, f) mice. n/a, 
not applicable. g, Flow cytometry of cells sorted from small intestines of 
Il25F25/F25 mice and stained with anti-DCLK1. Scale bars, 50 μm. All data are 
biological replicates. Data are representative of two (b-d, g), or at least three 
(a, e, f) experiments. In a, n > 5; in b-d, g, n = 2; in e, f, n = 3. 
Figure 2 | Worm infection induces IL-13-dependent tuft cell 
hyperplasia. a–c, Jejunum from Il25F25/F25 mice stained for RFP (a) or 
DCLK1 (b, c). d–h, Immunohistochemical quantification of tuft cells 
(DCLK1+) in duodenum/jejunum of mice infected with N. brasiliensis for 
indicated days (d) or 7 days (e) or injected with indicated protein (f–h). 
Scale bars: 50 μm (a, b), 1 mm (c). All data are biological replicates. 


Extended Data Fig. 2). We also found RFP in epithelial cells of the 
trachea and gall bladder, but not in haematopoietic cells (Extended 
Data Figs 2 and 3a). 

The small intestinal epithelium consists of a single cell layer con- 
tinuously repopulated from stem cells in underlying crypts; cells pro- 
gress up the villi and are sloughed into the lumen with a turnover 
of 3-5 days. Nascent progenitors proliferate in the transit amplifying 
region before fate commitment to become absorptive enterocytes or, 
less frequently, one of four secretory cell types: Paneth, enteroendocrine, 
goblet, or tuft12,13. We tested whether Flare25 marks one or 
more secretory lineages. Immunohistochemistry showed no colo- 
calization of RFP with the enteroendocrine marker chromogranin 
A (CHGA), the Paneth-cell markers lysozyme (LYZ)1 and LYZ2, or 
the goblet-cell marker mucin 2 (MUC2) (Fig. 1c and Extended Data 
Fig. 4a, b). Unexpectedly, expression of RFP and the tuft-cell markers 
doublecortin-like kinase 1 (DCLK1) and epithelial prostaglandin- 
endoperoxide synthase 1 (PTGS1) completely overlapped (Fig. 1d and 
Extended Data Fig. 4a, b). Transcriptional analysis comparing sorted 
RFP+EPCAM+ with RFP−EPCAM+ intestinal epithelium demonstrated 
Il25 expression almost exclusively in RFP+ cells (Fig. 1e), and 
confirmed co-staining results (Fig. 1f and Extended Data Fig. 3b). 
The tuft-cell markers Dclk1, Ptgs1, Gnat3, Chat, Gfi1b and Trpm5 
(refs 14, 15) were each enriched at least 750-fold in RFP+ cells, while 
Chga, Muc2, Lyz1 and Lyz2 showed no enrichment (Fig. 1f). Finally, 
>99% of sorted RFP+EPCAM+ and <1% of RFP−EPCAM+ cells were 
DCLK1+ by flow cytometry (Fig. 1g). Given these results, and our 
identification of RFP+ cells only in epithelia where tuft cells have been 
noted (Extended Data Figs 2 and 4c; data not shown)14, we conclude 
that tuft cells constitutively express Il25 and that all Il25+ cells are tuft 
cells, at least as assessed using this reporter. By contrast, tuft cells are 
not major sources of TSLP or IL-33 in the small intestine (Extended 
Data Fig. 5). 
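
The text reports fold enrichment of tuft-cell transcripts in RFP+ cells but does not state how the fold values were computed. One common convention for RT-PCR data is the comparative Ct (2^-ΔΔCt) calculation, sketched below as an assumption rather than the authors' method. The function name and all Ct values are hypothetical.

```python
def fold_enrichment(ct_target_sample: float, ct_ref_sample: float,
                    ct_target_control: float, ct_ref_control: float) -> float:
    """Comparative Ct (2^-ddCt) fold change of a target gene.

    Each target Ct is first normalized to a reference (housekeeping) gene
    measured in the same population, then the sample is compared with the
    control population. Assumes roughly 100% amplification efficiency.
    """
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the target is detected 10 cycles earlier in the
    # sorted population than in the control, at equal reference gene Ct.
    print(fold_enrichment(20.0, 18.0, 30.0, 18.0))
```

On this scale, a 750-fold enrichment corresponds to a ΔΔCt of roughly −9.6 cycles.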

Although intestinal tuft cells (also called brush cells) were discov- 
ered more than 50 years ago, their function remains largely unknown. 
Given the link between IL-25 and type 2 immunity, we investigated 
the role of tuft cells during infection of mice with the roundworm 
Data are representative of at least three (a–c) experiments or pooled (d–h) 
from multiple experiments. In a–c, n > 10; in d, day 2: n = 2; days 4, 12, 
14: n = 4; day 9: n = 5; days 6, 8, 10: n = 6; day 7: n = 8; in e–h, n is as 
shown. Nb, N. brasiliensis. *P < 0.05; **P < 0.01, ***P < 0.001; NS, not 
significant (Mann-Whitney test). Error bars represent mean ± standard 
error of the mean (s.e.m.). 


Nippostrongylus brasiliensis, which induces a strong type 2 immune 
response that clears intestinal worms 7-10 days post-infection (d.p.i.). 
Although tuft cells account for <1% of intestinal epithelium in uninfected 
mice, we found dramatic (>15-fold) tuft cell hyperplasia in 
the small intestine 7 d.p.i. (Fig. 2a–c). The extent of hyperplasia was 
uniform throughout the small intestine (Extended Data Fig. 6), but we 
focused further experiments on the duodenum and jejunum, where 
N. brasiliensis resides. We observed no hyperplasia in the stomach or 
colon (Extended Data Fig. 6), through which the worms briefly transit. 
Hyperplasia in the small intestine peaked 8-9 d.p.i. and returned to 
near homeostatic levels by 14 d.p.i. (Fig. 2d). As in uninfected mice, 
RFP+ cells were CHGA−, MUC2− and LYZ1/2−, and DCLK1+ and 
PTGS1+ 7 d.p.i. (Extended Data Fig. 7). Given the complete overlap 
of RFP and DCLK1, we used these markers interchangeably in further 
experiments. 

Since helminth-induced goblet cell hyperplasia is mediated by 
IL-13 (ref. 16), we asked whether IL-13 also mediates tuft cell hyperplasia. 
Indeed, tuft cell hyperplasia was absent in infected interleukin 
4 receptor α (Il4ra)-deficient mice, in which both IL-4 and 
IL-13 signalling is disrupted, and in Il13 deleter mice (Il13cre/cre; 
Gt(ROSA)26STOP-flox::DTA/+) (Fig. 2e). Because tuft cell hyperplasia 
was normal in IL-4-deficient mice (Il4KN2/KN2) (Fig. 2e), and because 
the predominant role of IL-13 in IL-25-mediated pathologies is well 
established17,18, we conclude that IL-13 is the primary signal driving 
tuft cell hyperplasia in vivo. We also found Il4/13-dependent tuft cell 
hyperplasia after infection with Heligmosomoides polygyrus, another 
intestinal parasite (Extended Data Fig. 7c). 

Lamina propria ILC2s are the principal intestinal source of IL-13 
in the first week of N. brasiliensis infection'®”°. Accordingly, tuft cell 
hyperplasia was absent or reduced 7 d.p.i. in mice lacking nearly all 
lymphoid cells (Il7ra−/−, Il2rg−/−) or IL-5+ ILC2 cells (Il5 deleter: 
TIS! Gt(ROSA) 2651 OP flox::DIA/STOP flox::DTA) (Fig. 2e). Hyperplasia 
was greatly reduced in Ragl~/~ mice, but nearly normal in 1/4/13"; 
Cd4-cre mice (Fig. 2e), consistent with the model that T cells produce 
little IL-13 at early time points but can boost IL-13 levels by supporting 
ILC2 activation”). 
Figure 3 | IL-13 signalling in epithelial progenitors gives rise to tuft 
cell hyperplasia. a, Flow cytometric quantification of RFP+ tuft cells 
in intestinal organoids grown from Il25F25/F25 mice and treated with 
indicated recombinant proteins (20 ng ml−1). b, Immunohistochemical 
quantification of tuft cells (DCLK1*) in duodenum/jejunum of 

mice infected 7 days with N. brasiliensis. c, Jejanum of Lgr5‘?!"/*; 
Gt(ROSA)26510P flox:RFP/+ mice treated 5 days with tamoxifen and 
stained for DCLK1 (green) and DAPI (blue). N. brasiliensis infection as 
indicated. d, e, Jejunum of 1125'?°/'5 (d) or Wt(B6) (e) mice infected for 


To test if type 2 cytokines are sufficient to induce tuft cell hyperplasia, 
we injected mice with stabilized IL-4–anti-IL-4 complexes that 
mimic IL-13 signalling, or with IL-25 or IL-33. All three treatments 
expanded tuft cells, but the effects of IL-25 and IL-33 were IL4RA-dependent 
and severely reduced in the absence of lymphoid cells 
(Il7ra−/−, Il2rg−/−), Il5+ cells, or Il13+ cells (Fig. 2f, g), suggesting 
that IL-25 and IL-33 trigger hyperplasia indirectly by inducing IL-13 
production in ILC2s. By contrast, tuft cell hyperplasia was reduced 
only partially in Il7ra−/− and Il2rg−/− mice injected with IL-4 (Fig. 2h), 
raising the possibility that exogenous IL-4 or endogenous IL-13 
induces tuft cell hyperplasia by directly targeting the epithelium. 

Consistent with this model, recombinant IL-4 and IL-13 induced 
tuft cell hyperplasia in intestinal organoids, which contain only epithe- 
lial cells; by contrast, IL-25 and IL-33 neither induced hyperplasia nor 
enhanced induction by IL-4/IL-13 in this system (Fig. 3a and Extended 
Data Fig. 3c, d). As expected, tuft cell hyperplasia was absent in 14ral, 
Vill-cre mice, confirming that tuft cell hyperplasia requires IL4RA 
signalling in the intestinal epithelium in vivo (Fig. 3b). We conclude 
that ILC2-derived IL-13 signals through IL4RA in the intestinal epi- 
thelium to induce tuft cell hyperplasia. 

We examined several possible mechanisms of IL-13-induced tuft cell 
hyperplasia. First, CHGA* cell numbers did not change during infec- 
tion (Extended Data Fig. 8b, c), confirming a selective increase in tuft 
cells rather than a global expansion of intestinal epithelium. Through 
lineage tracing, we confirmed that tuft cells, as under resting conditions, 
continue to arise from Lgr5+ stem cells during N. brasiliensis-induced 
hyperplasia (Fig. 3c). Since Il4ra is expressed throughout the 
intestinal epithelium?””’, tuft cell expansion could occur either before 
or after lineage commitment; however, we found expression of the 
proliferation marker Ki67 in only a few nascent tuft cells (Fig. 3d), 
suggesting that tuft cell hyperplasia is induced in the stem or transit 
amplifying compartments. Consistent with this model, kinetic studies 
revealed a wave of hyperplastic tuft cells appearing near the crypts 
6d.p.i. and moving up the villi by 8 d.p.i. (Fig. 3e). Taken together, 
these results suggest that IL-13 signalling in uncommitted intestinal 
epithelium shifts cell fate decisions towards the tuft (and goblet) cell 
lineage, perhaps by altering the balance of Notch signalling'”. Indeed, 
the Notch signalling inhibitor N-[N-(3,5-Difluorophenacetyl)- 
L-alanyl]-S-phenylglycine t-butyl ester (DAPT) also induced tuft cell 
hyperplasia in organoids (Extended Data Fig. 3e-f). 

Since ILC2s secrete IL-13 in response to IL-25, and tuft cells are 
the source of intestinal IL-25, we hypothesized a feed-forward circuit 
between tuft cells, ILC2s, and epithelial progenitors (Extended Data 
Fig. 8d). Tuft cells constitutively express Il25 (Fig. 1a, e) and some 
7 days (d) or indicated number of days (e) with N. brasiliensis and stained 
for RFP (red) and Ki67 (green) (d) or DCLK1 (green) and DAPI (blue) 
(e). d, e, Dotted lines outline villi. Scale bars, 50 μm. Arrowheads indicate 
Ki67 and RFP overlap. Data in a are technical replicates, all other data 
are biological replicates. Data are representative of two (c–e) or three 
(a) experiments or pooled (b) from multiple experiments. In a, b, n is as 
shown; in c, uninfected: n = 4; 7 d.p.i.: n = 5; in d, n = 2; in e, n = 4. Nb, 
N. brasiliensis. *P < 0.05; NS, not significant (Mann-Whitney test). Error 
bars represent mean ± s.e.m. 


lamina propria ILC2s constitutively express Il13 (ref. 24), so we first 
examined the interaction of tuft cells and ILC2s in uninfected mice. 
Using Il13Smart/Smart reporter mice to mark IL-13-secreting cells, we 
found that ~20% of lamina propria ILC2s make IL-13 in the absence 
of infection, and this is dependent on IL-25 (Fig. 4a and Extended 
Data Fig. 9a). The frequency of ILC2s is also decreased in Il25−/− 
mice (Fig. 4b), suggesting that tuft-cell-derived IL-25 promotes ILC2 
maintenance in the small intestine. ILC2 activation remains IL-25-dependent 
at 4 d.p.i., the latest time point before worm clearance 
at which we could recover viable cells from infected intestines (Fig. 4a 
and Extended Data Fig. 9a). TH2 cells are not a major source of IL-13 
at rest or 4 d.p.i. (Extended Data Fig. 9b, c). 

Our model of an ILC2-epithelial circuit predicts that loss of 
homeostatic IL-25 or IL-13 would reduce the frequency of tuft cells. 
Indeed, the already small number of tuft cells was further reduced in 
uninfected Il25−/− and Il4ra−/− mice (Fig. 4c), with the remaining 
tuft cells in these mice probably representing stochastic production 
independent of type 2 immunity. The frequency of CHGA+ cells was 
unchanged in Il25−/− mice (Extended Data Fig. 8b, c). We generated 
Il25F25/F25;Vil1-cre mice to delete Il25 selectively from the epithelium 
(Extended Data Fig. 1b, c). The basal frequency of tuft cells again 
decreased (Fig. 4d), confirming that tuft cells are the relevant source 
of IL-25 upstream of homeostatic production of IL-13 by intestinal 
ILC2s. 

The dependence of tuft cell frequency on autocrine IL-25 was even 
more striking during N. brasiliensis infection. Tuft cell hyperplasia was 
absent in Il25−/− and Il25F25/F25;Vil1-cre mice 7 d.p.i., but unaffected by 
the absence of TSLP or IL-33 signalling (Fig. 4e, f). Taken together, our 
data support a model in which IL-25 from tuft cells induces ILC2s to 
produce IL-13, which in turn regulates the frequency of tuft cells in the 
intestinal epithelium. The capacity of IL-13 to alter the cellular com- 
position of the epithelium raised the possibility that tuft cells might 
also regulate other secretory lineages. While neither CHGA* nor 
Paneth cell hyperplasia occurred during worm infection (Extended 
Data Fig. 8a—c), goblet cell hyperplasia and hypertrophy were absent 
in infected mice lacking epithelial IL-25 (Fig. 4g-i). Moreover, as in 
Il25−/− mice, Il25F25/F25;Vil1-cre mice failed to clear worms by 10 d.p.i., 
thereby identifying tuft cells as key regulators of the type 2 immune 
response (Fig. 4j). 

Our findings uncover an unexpected role for tuft cells in intestinal 
immune defence, suggest a link between type 2 immune signalling 
and epithelial cell fate decisions, and describe an ILC2-epithelial 
response circuit that regulates the cellular composition of the intes- 
tinal epithelium. This circuit could integrate homeostatic signals, 
Figure 4 | Tuft cells regulate intestinal physiology through an 
ILC2-epithelium response circuit. a, b, Flow cytometric analysis of 
lamina propria ILC2s (Lin−, CD45+, GATA3+) from Il13Smart/+ and 
Il25−/−;Il13Smart/+ mice. c, d, Flow cytometric quantification of tuft cells 
(DCLK1+) in jejunum of uninfected mice. e, f, Immunohistochemical 
quantification of tuft cells (DCLK1+) in jejunum/duodenum of mice 
infected 7 days with N. brasiliensis. In e, wild-type controls are the same 
as in Fig. 2e. g, Jejunum of mice treated as indicated and stained for goblet 


such as ILC2 activation during feeding”, with additional signals to 
tune the barrier’s absorptive-secretory balance. Indeed, despite the 
feed-forward nature of the circuit, tuft and goblet cell hyperplasia are 
restrained during homeostasis, suggesting that worm infection pro- 
vides another activating signal or removes an inhibitory signal. Given 
their positioning in the intestinal epithelium, tuft cells appear poised 
to monitor luminal homeostasis and transduce activating signals to 
immune cells in the lamina propria. Interestingly, tuft cells encode the 
complete bitter and umami taste transduction pathways and release 
acetylcholine when activated". In the airways and urethra, this path- 
way has been linked to smooth muscle contraction and is proposed 
to promote innate defence against invading bacteria”>**. 

In the absence of IL-25, other signals such as IL-33 become 
induced? and can activate ILC2s to mediate expansion of tuft and 
goblet cells associated with worm clearance. Nonetheless, our findings 
delineate a key role for tuft cells in the physiological host response 
to helminths. Tuft cells appear to be the primary source of IL-25 in 
the lung as well (Extended Data Fig. 4c); thus, their involvement may 
extend to other conditions in which IL-25 has been implicated, such 
as airway disease!!° and allergic diarrhoea*”. Indeed, mucosal tuft 
cell hyperplasia may be a generalizable hallmark of type 2 immune 
responses, and strategies to short-circuit tuft cell activation may have 
therapeutic potential for these widespread afflictions. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 
Il25 reporter mice. Flare25 mice were generated by homologous gene targeting in 
C57BL/6 embryonic stem cells. A 2.2 kb 3’ homology arm beginning in the 3’ UTR 
of Il25 was amplified from C57BL/6 genomic DNA and cloned into pKO915-DT 
(Lexicon Genetics) using BamHI and HindIII. Next, a DNA strand encoding (from 
5’ to 3’) a loxP site and the complete third exon of Il25 was synthesized (Blue 
Heron). This synthetic strand and genomic DNA were used as PCR templates to 
generate a 2.1 kb 5’ homology arm by overlap extension PCR. The 5’ homology arm 
was cloned into the pKO915-DT vector containing the 3’ homology arm using XhoI 
and EcoRI. Finally, a reporter cassette encoding (in order from 5’ to 3’): a loxP site, 
encephalomyocarditis virus IRES, tandem RFP, bovine growth hormone poly(A), 
and a frt-flanked neomycin resistance cassette, was subcloned into the homology 
arm containing pKO915-DT vector using AscI. The final construct was linearized 
with NotI and transfected by electroporation into C57BL/6 embryonic stem cells. 
Cells were grown on irradiated feeders with the aminoglycoside G418 in the media, 
and neomycin-resistant clones were screened for 5’ and 3’ homologous recombina- 
tion by PCR. Four positive clones were selected and further tested to confirm inser- 
tion of the 5’ loxP site. Two clones were selected for injection into albino C57BL/6 
blastocysts to generate chimaeras, and the male pups with highest ratios of black- 
to-white coat colour from a single clone were selected to breed with homozygous 
Gt(ROSA26)FLP1/FLP1 females (Jackson Laboratories catalogue no. 009086) to excise 
the neomycin resistance cassette. Deletion of neomycin was confirmed by PCR. 
Flare25 genotyping primers were as follows: KI_F: GIATTGGGTGCCAGAACAG; 
KI_R: GGGTCGCTACAGACGTTGTTTGTC (715 bp knock-in band; 
374 bp floxed band); WT_F: ACTTTACCACAACCAGACG; WT_R: 
AGTTTCTCCCCAAGTCCTCC (290 bp wild-type band). 
Other mice. Mice were maintained in the University of California San Francisco 
(UCSF) specific pathogen-free animal facility in accordance with the guidelines 
established by the Institutional Animal Care and Use Committee and Laboratory 
Animal Resource Center. All experimental procedures were approved by the 
Laboratory Animal Resource Center at UCSF. Mice aged 6–12 weeks were used 
for all experiments. Mice were age- and sex-matched in figures displaying a single 
representative experiment. Pooled results include both male and female mice. For 
some experiments, mice encoding a reporter allele that does not impact 
endogenous gene expression (B6.Il13Smart or B6.Arg1Yarg) were used as wild-type 
controls. Il7ra−/− (B6.129S7-Il7rtm1Imx/J; 002295), Rag1−/− (B6.129S7-Rag1tm1Mom/J; 
002216), Cd4-cre (B6.Cg-Tg(Cd4-cre); 022071) and wild-type (C57BL/6J; 000664) 
mice were purchased from Jackson Laboratories. Il4ra−/− (BALB/c-Il4ratm1Sz/J; 
003514) mice were purchased from Jackson Laboratories and backcrossed to 
C57BL/6J for at least eight generations. [/2rg~/~ (B10;B6-Il2rg'™; 4111-F) 
were purchased from Taconic as Rag-2~/~;[l2rg~/~ and outcrossed to isolate the 
Tl2rg allele. B6.1125-/-, B6. Tslpr~/~, B6.1133r~/~, Bo. 138 /Sat, BG, T]4KN2/KN?, 
Bo. Tssre/ere, Gt(R OSA) 268TOP -flox::DTA/STOP.flox::DTA and B6.II1 3ere/cre. 
Gt(ROSA) 26910? flox:DTA/+ mice were obtained or generated as described*!?4, 
BALB/c.1/4/13" mice were generated as described”? and backcrossed to 
C57BL/6) for at least eight generations. B6.1/4ra!" mice were provided by 
A. Chawla. B6.T¢(Vill-cre) mice were provided by A. Ma. B6.Lgr5#&ipsre-Ert2/+, 
Gt(ROSA) 26510? flox:RFP/+ mice were provided by O. Klein. 
Mouse infection and treatment. Infectious third-stage N. brasiliensis larvae (L3) 
were raised and maintained as described'®. Mice were infected subcutaneously with 
500 N. brasiliensis L3 or by oral gavage with 200 H. polygyrus L3, and were killed 
at the indicated time points to collect tissues for staining or to count intestinal 
worm burden, as described!’. Mice were given IL-4, IL-25, and IL-33 as follows: 
IL-4 complexes were generated by incubating 2 µg mouse IL-4 (R&D Systems) 
with 10 µg LEAF purified anti-mouse IL-4 antibody (clone 11B11, Biolegend) for 
30 min at room temperature, and then administered on day 0 and day 2. IL-25 
and IL-33 were given in doses of 500 ng on days 0, 1, 2, and 3. All injections were 
given intraperitoneally in 200 µl, and all intestines were harvested for sectioning 
and staining on day 4. For lineage tracing, 2.5 mg of tamoxifen in 250 µl corn oil 
were given intraperitoneally 5 days before harvest. 
Fixed tissue preparation and staining. For immunohistochemistry, tissues were 
fixed in 4% paraformaldehyde for 3 h at 4°C followed by PBS wash and overnight 
incubation in 30% (w/v) sucrose. For stomach, small intestine, caecum and large 
intestine, tissues were flushed with PBS before fixation. Unless otherwise noted, 
the proximal 10-12 cm of small intestine (duodenum and partial jejunum) were 
harvested. Tissues were embedded in Optimal Cutting Temperature Compound 
(Tissue-Tek) and stored at −80°C before sectioning (8–10 µm) on a Cryostat 
(Leica). To facilitate analysis of the entire sample, small and large intestines were 
coiled into a ‘Swiss roll’ before embedding. 

Immunohistochemistry was performed in Tris/NaCl blocking buffer (0.1 M 
Tris-HCl, 0.15 M NaCl, 5 µg ml−1 blocking reagent (Perkin Elmer), pH 7.5) as 
follows: 1 h 5% goat serum, 1 h primary antibody, 40 min secondary antibody, 5 min 
DAPI (Roche). For RFP co-labelling experiments, slides were stained for MUC2, 
LYZ1, CHGA, or DCLK1 as described above, excluding the DAPI step. RFP 
staining was then as follows: 1 h rabbit IgG, 20 min each of biotin and streptavidin 
block (Vector Labs), 1 h anti-RFP-biotin, 40 min streptavidin-Cy3 (Caltag), and 
5 min DAPI. See Extended Data Table 1 for a list of antibodies used in this study. 

For goblet cell staining, 8-cm sections of jejunum were fixed for 3 h in 10% 
buffered formalin (Fisher Scientific) at 4°C before coiling into a ‘Swiss roll’ and 
returning to formalin. After 24 h, tissues were moved to 70% ethanol for storage. 
Tissue processing, paraffin embedding, and sectioning were performed by the 
UCSF Mouse Pathology Core. Periodic acid Schiff (PAS) and Alcian blue staining 
were performed as follows: cleared with xylenes (Fisher Scientific), rehydrated, 
30 min in Alcian blue (Thermo Scientific), 5 min in periodic acid (Thermo 
Scientific), 15 min in Schiff reagent (Thermo Scientific), dehydrated, and mounted. 
Brightfield and fluorescent images were acquired with an AxioCam HR camera on 
an AxioImagerM2 upright microscope (Zeiss). 
Tuft and goblet cell quantification. In uninfected mice, a 2.5-cm section of small 
intestine was harvested beginning 10cm distal to the stomach and processed into a 
single-cell epithelial suspension as described later. After analysis by flow cytometry, 
frequency of tuft cells was calculated as number of DCLK1+EPCAM+ cells/total 
number of EPCAM+ cells. Because viable epithelial cells cannot be harvested from 
N. brasiliensis-infected intestines from ~5 to ~12 d.p.i., immunohistochemistry 
was used to quantify tuft cell frequency in infected mice. The proximal 10 cm of 
small intestine were harvested and stained for DCLK1 as described earlier. A 4 × 4 
grid of images was collected at ×200 magnification and the total area of DCLK1 
and DAPI staining above background was calculated using ImageJ. Tuft cell 
frequency was calculated as DCLK1 staining area/DAPI staining area. 
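
The two frequency definitions above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and input layout are hypothetical, and summing the per-image areas over the 4 × 4 grid before taking the ratio is an assumption based on the phrase "total area".

```python
def tuft_frequency_flow(dclk1_pos_epcam_pos, total_epcam_pos):
    """Flow cytometry: DCLK1+EPCAM+ events divided by all EPCAM+ events."""
    if total_epcam_pos <= 0:
        raise ValueError("total EPCAM+ count must be positive")
    return dclk1_pos_epcam_pos / total_epcam_pos


def tuft_frequency_imaging(dclk1_areas, dapi_areas):
    """Imaging: total DCLK1 area divided by total DAPI area.

    Each list holds one above-background area per image in the grid
    (assumed layout; any consistent area unit works).
    """
    total_dapi = sum(dapi_areas)
    if total_dapi <= 0:
        raise ValueError("total DAPI area must be positive")
    return sum(dclk1_areas) / total_dapi
```

Both functions return a fraction; multiply by 100 to express the result as a percentage.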

For goblet cell quantification, tissue sections were prepared and stained with 
PAS Alcian blue as described earlier. Goblet cells were manually counted and the 
total length of all analysed villi was measured using ImageJ. Goblet cell frequency 
was expressed as number of goblet cells/millimetre of villus. At least 15 villi were 
counted for each replicate. Mucus production was estimated by measuring the area 
of at least 15 goblet cells for each biological replicate. 
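
The goblet cell metric described above (cells per millimetre of villus, with at least 15 villi per replicate) could be computed as in the following sketch. The function name, the per-villus input lists and the use of micrometres for the measured lengths are illustrative assumptions, not details taken from the paper.

```python
def goblet_cells_per_mm(goblet_counts, villus_lengths_um, min_villi=15):
    """Goblet cells per millimetre of villus for one biological replicate.

    goblet_counts     -- manual goblet cell count for each analysed villus
    villus_lengths_um -- measured length of each villus in micrometres (assumed unit)
    min_villi         -- minimum number of villi required per replicate
    """
    if len(goblet_counts) != len(villus_lengths_um):
        raise ValueError("one count and one length are needed per villus")
    if len(goblet_counts) < min_villi:
        raise ValueError("at least %d villi are required" % min_villi)
    total_length_mm = sum(villus_lengths_um) / 1000.0
    return sum(goblet_counts) / total_length_mm
```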

Single-cell tissue preparation. For single-cell epithelial preparations, small intes- 
tines were flushed with PBS, opened, and rinsed with PBS to remove luminal 
contents. Two-and-a-half- to five-cm-long segments of jejunum were incubated 
with rocking for 20 min at 37°C in 5 ml PBS containing 2.5 mM EDTA 
(Sigma-Aldrich), 0.75 mM dithiothreitol (DTT; Sigma-Aldrich), and 10 µg ml−1 DNase I 
(Sigma-Aldrich). Tissues were shaken vigorously for 30 s and released cells were 
incubated with rocking for 10 min at 37°C in 5 ml HBSS (Ca2+/Mg2+ free) 
containing 1.0 U ml−1 Dispase (Gibco) and 10 µg ml−1 DNase I. Digested cells were 
passed through a 70 µm filter and washed once before staining for flow cytometry. 

For lamina propria preparations, small intestine was harvested from 4-10 cm 
distal to the stomach (duodenum/jejunum), flushed with PBS, opened, and 
thoroughly cleaned with PBS. Intestines were incubated with gentle rocking for 15 min 
at 37°C in 10 ml HBSS (Ca2+/Mg2+ free) supplemented with 5% fetal calf serum 
(FCS), 10 mM HEPES (UCSF Cell Culture Facility), 10 mM DTT and 5 mM EDTA. 
Intestines were gently vortexed, supernatants discarded, and incubation repeated 
with fresh DTT/EDTA solution. Next, intestines were incubated with gentle 
rocking for 20 min at 37°C in 20 ml HBSS (Ca2+/Mg2+ replete) supplemented with 5% 
FCS and 10 mM HEPES. After incubation, intestines were gently vortexed, cut 
into small pieces and incubated with gentle rocking for 30 min at 37°C in 5 ml 
HBSS (Ca2+/Mg2+ replete) supplemented with 5% FCS, 10 mM HEPES, 30 µg ml−1 
DNase I, and 0.1 Wünsch ml−1 LiberaseTM (Roche). After digest, intestines were 
mechanically dissociated in GentleMACS C tubes (Miltenyi Biotec), passed 
through a 70 µm filter, and washed. The resulting cell pellet was resuspended in 
4ml 40% Percoll (Sigma-Aldrich), underlaid with 4 ml 90% Percoll and centrifuged 
at 2,200 r.p.m. for 20 min at 4°C. The 40/90 interphase of the Percoll gradient was 
harvested, washed, and stained for flow cytometry. 

Flow cytometry. For surface staining, single-cell suspensions prepared as 
described earlier were incubated with anti-CD16 and CD32 monoclonal anti- 
bodies (UCSF Antibody Core Facility) for 10 min at 4°C. The cells were stained 
with antibodies to surface markers for 20 min at 4°C followed by DAPI for dead 
cell exclusion. See Extended Data Table 1 for a list of antibodies used in this study. 

For intracellular DCLK1 staining, single-cell epithelial suspensions were pre- 
pared from 2.5cm sections of small intestine harvested 8 cm distal to the stom- 
ach. Staining was as follows: 15 min at 4°C in Violet Live/Dead fixable stain (Life 
Technologies), 15 min at room temperature in 2% paraformaldehyde (Electron 
Microscopy Sciences), 20 min at room temperature in saponin-based perme- 
abilization and wash (perm/wash) reagent (Life Technologies) supplemented 
with 10% goat serum, 30 min with rabbit anti-doublecortin-like kinase (Abcam; 
ab31704) in perm/wash, 20 min in F(ab’)2 goat anti-rabbit IgG-Alexa Fluor 488 
(Life Technologies) and rat anti-EPCAM-PerCP-Cy5.5 (Biolegend; 16A8). 
For intracellular GATA3 staining, single-cell lamina propria suspensions were 
prepared from the small intestines as described earlier and stained according 
to manufacturer’s protocol for FoxP3/Transcription Factor Staining Buffer Set 
(eBiosciences). 

Samples were analysed on an LSR II (BD Biosciences) with four lasers (403 nm, 
488 nm, 535 nm, and 633 nm). Samples were FSC-A/SSC-A gated to exclude debris, 
FSC-W/FSC-A gated to select single cells, and gated to exclude DAPI+ dead cells. 
Data were analysed with FlowJo 10 (Treestar). 

Quantitative RT-PCR. Single-cell epithelial suspensions were isolated and stained 
as described earlier and then sorted into RFP+EPCAM+ and RFP−EPCAM+ 
populations using a MoFlo XDP (Beckman Coulter). RNA was isolated using the Micro 
Plus RNeasy kit (Qiagen) and reverse transcribed using the SuperScript Vilo Master 
Mix (Life Technologies). The resulting cDNA was used as template for 
quantitative PCR with the Power SYBR Green reagent on a StepOnePlus cycler (Applied 
Biosystems). Transcripts were normalized to Rps17 (40S ribosomal protein S17) 
expression. See Extended Data Table 1 for a list of primers used in this study. 
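
Normalization to a reference transcript such as Rps17 is often expressed with the ΔCt convention, sketched below. The text does not state which formula was used, so the 2^(−ΔCt) form, the assumption of roughly 100% amplification efficiency, and the function name are all illustrative rather than a description of the authors' calculation.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of a target transcript relative to a reference transcript.

    Uses the common 2^(-dCt) convention, where dCt = Ct(target) - Ct(reference).
    Assumes both amplicons double every cycle (about 100% efficiency).
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)
```

A target that crosses threshold five cycles after the reference is therefore reported at 1/32 of the reference level.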

Organoid culture. Small intestinal crypt-derived organoids were grown as 
described³¹, replacing recombinant R-spondin with supernatants from R-spondin 
expressing L-cells (provided by O. Klein). Crypts were harvested from Il25F25/F25 
mice and plated on day 0. On day 3 and day 5, media were replaced and 
organoids were treated with 20 ng ml−1 of the indicated recombinant protein. On day 
6 organoids were harvested into HBSS (Ca2+/Mg2+ replete) containing 200 U ml−1 
Collagenase I (Gibco) and 1.8 U ml−1 Dispase (Gibco). Organoids were incubated 
for 1.5h at 37°C with shaking, washed, and then stained for flow cytometry as 
described earlier. 

Statistical analysis. All experiments were performed using randomly assigned 
mice without investigator blinding. All data points and n values reflect biological 
replicates (that is, mice), except in Fig. 3a, where data points on the graph rep- 
resent technical replicates. No data were excluded. Where noted in the figures, 
statistical significance was calculated without assumption of normal distribution 
using a Mann-Whitney test. Experimental groups included a minimum of three 
biological replicates, as required by the Mann-Whitney test. Intragroup variation 
was not assessed. All statistical analysis was performed using Prism 6 (GraphPad 
Software). Figures display means ± s.e.m. No statistical methods were used to 
predetermine sample size. 
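
The summary statistic and test named above can be illustrated with a small standard-library sketch. The analysis in the paper was run in Prism 6; the code below is an independent illustration, not that software's implementation. It computes mean and s.e.m., and an exact two-sided Mann-Whitney P value by enumerating every way of splitting the pooled values into two groups. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for one group."""
    return mean(values), stdev(values) / sqrt(len(values))


def u_statistic(x, y):
    """Mann-Whitney U for x: pairs with x > y count 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def mann_whitney_exact(x, y):
    """Return (U, exact two-sided P) by enumerating all group relabellings."""
    pooled = list(x) + list(y)
    n1, n2 = len(x), len(y)
    centre = n1 * n2 / 2.0
    u_obs = u_statistic(x, y)
    observed = abs(u_obs - centre)
    extreme = 0
    total = 0
    indices = range(len(pooled))
    for chosen in combinations(indices, n1):
        chosen_set = set(chosen)
        gx = [pooled[i] for i in chosen]
        gy = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Count relabellings at least as far from the centre as the observed U.
        if abs(u_statistic(gx, gy) - centre) >= observed - 1e-12:
            extreme += 1
    return u_obs, extreme / total
```

Full enumeration is only practical for small groups such as the per-group mouse numbers reported here. For two completely separated groups of three, the enumeration covers 20 relabellings, of which 2 are as extreme as the observed one, giving P = 0.1.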


31. Sato, T. & Clevers, H. Primary mouse small intestinal epithelial cell cultures. 
Methods Mol. Biol. 945, 319-328 (2013). 
Extended Data Figure 1 | Flare25 mouse and Vil1-cre-mediated Il25 
deletion. a, Gene-targeting strategy for the flox and reporter of Il25 
(Flare25) mouse. b, PCR of genomic DNA isolated from the tail (lane 1, 
2) or cells sorted from the small intestine (lane 3, 4) of indicated mice. 
c, Quantitative RT-PCR for Il25 on cDNA from EPCAM+ cells sorted 
of two experiments (n = 2). Frt, target site for FLIPASE recombinase; 
IRES, internal ribosomal entry site; loxP, target site for Cre recombinase; 
pA, bovine growth hormone poly(A) tail; tdRFP, tandem-dimer red 
fluorescent protein; UTR, untranslated region. For gel source data (b) see 
Supplementary Fig. 1. 
Extended Data Figure 2 | Il25 expression in epithelial surfaces. 
a, b, Indicated tissues of Il25F25/F25 (a) and wild-type control (b) mice 
stained by immunohistochemistry for RFP (red), EPCAM (green), and 
DAPI (blue). Some data from Fig. 1a are repeated here to allow complete 
comparison. Scale bars, 50 µm. Images are representative of at least three 
independent experiments. n = 3. 
Extended Data Figure 3 | Flow cytometry gating strategies and organoid 
culture. a, Flow cytometric analysis of indicated tissues in Il25F25/F25 and 
wild-type mice. b, Flow cytometric analysis of small intestine epithelial 
cells of Il25F25/F25 mice before and after fluorescence-activated cell sorting 
(FACS) into RFP+EPCAM+ and RFP−EPCAM+ pools for analysis by 
quantitative RT-PCR. c–e, Representative flow cytometric analysis of 
small-intestine-derived organoids from Il25F25/F25 (c–e) and wild-type 
(c) mice cultured with or without recombinant protein (20 ng ml−1), 
as indicated (c–e) or Notch signalling inhibitor DAPT (25 µM) (e). 
Single-cell suspensions of the organoids were stained for EPCAM (c–e) 
and DCLK1 (d), and gated to quantify tuft cell (RFP+EPCAM+ or 
DCLK1+EPCAM+) frequency. f, Quantification of two technical replicates 
from experiment shown in e. d.p.i., days post-N. brasiliensis infection. 
Data in f are technical replicates. Data are representative of three (a, b, d) 
or two (c, e, f) independent experiments. In a–d, n = 3; in e, f, n = 2. 
Error bars represent mean ± s.e.m. 
Extended Data Figure 4 | Il25 is expressed constitutively in tuft cells. 
a–c, Jejunum (a, b) or trachea (c) of Il25F25/F25 (a, c) and wild-type control 
(b) mice stained by immunohistochemistry for RFP (red), indicated 
lineage markers (green), and DAPI (blue). Scale bars, 50 µm. Images are 
representative of one (c) or two (a, b) independent experiments. n = 2. 
Extended Data Figure 5 | Tuft cells are not a major source of intestinal 
TSLP or IL-33. a, Jejunum of Il25F25/F25 mice stained for RFP (red), IL-33 
(green), and DAPI (blue). b–d, Quantitative RT-PCR on indicated (b, c) or 
RFP+EPCAM+ (d) cells sorted from untreated (b, c) mice or mice treated 
as indicated (d). RNA isolated from whole lung 8 days post-N. brasiliensis 
infection is used as a positive control for Tslp expression in c. Expression 
of Tslp in sorted Tslp-expressing cells of the lung would probably be higher. 
Scale bars, 50 µm. Data are representative of two independent experiments. 
In a, n = 3; in b–d, n = 2. 
Extended Data Figure 6 | N. brasiliensis induces tuft cell hyperplasia 
throughout the small intestine but not in stomach and colon. 
a, b, Indicated tissues of Il25F25/F25 (a) and wild-type control (b) mice 
treated as indicated and stained by immunohistochemistry for RFP (a) or 
DCLK1 (b) (red), EPCAM (green), and DAPI (blue). d.p.i., days post-N. 
brasiliensis infection. Scale bars, 50 µm. Data are representative of two 
(stomach and colon) or at least three (all others) independent experiments. 
In a, stomach and colon: n = 2; all others: n > 5. 
Extended Data Figure 7 | Il25 is expressed only in tuft cells during 
worm infection and H. polygyrus infection also induces tuft cell 
hyperplasia. a, b, Jejunum of Il25F25/F25 (a) and wild-type control (b) mice 
infected for 7 days with N. brasiliensis stained by immunohistochemistry 
for RFP (red), indicated lineage markers (green), and DAPI (blue). 
H. polygyrus and stained by immunohistochemistry for DAPI (blue), 
EPCAM (green) and DCLK1 (red). Scale bars, 50 µm. d.p.i., days 
post-H. polygyrus infection. Images are representative of one (c) or two 
(a, b) independent experiments. In a, b, n = 2; in c, n = 1 (uninfected) or 
n = 2 (infected). 
Extended Data Figure 8 | Absence of Paneth and CHGA+ cell 
hyperplasia after N. brasiliensis infection and model of ILC2-epithelial 
signalling circuit. a, b, Jejunum of indicated mice stained for DAPI 
(blue) and LYZ1/2 (a) or CHGA (b) (green). c, Quantification of CHGA+ 
cells from imaging in (b). d, During homeostasis, rare epithelial tuft 
cells of the small intestine constitutively express Il25, which maintains 
low levels of IL-13 production in lamina propria ILC2s. IL-13 in turn 
signals uncommitted epithelial progenitors to promote emergence of 
tuft and goblet cells. In the absence of infection, this feed-forward 
ILC2-epithelial circuit is restrained by as yet unknown mechanisms. After 
N. brasiliensis (N.b.) infection, a helminth-derived signal or a change in 
host physiology activates the ILC2-epithelial circuit leading to tuft and 
goblet cell hyperplasia and enhanced IL-13 production by ILC2s. Adaptive 
TH2 cells probably also provide IL-13 and/or support ILC2 activation, 
especially when infection or inflammation lasts more than a week. 
Recombinant proteins are sufficient to induce tuft cell hyperplasia, either 
by inducing IL-13 production in lymphoid cells (IL-25 or IL-33) or by 
directly binding epithelial progenitors (IL-4). Scale bars, 50 µm. d.p.i., days 
post-N. brasiliensis infection. Data in c are biological replicates. Data are 
representative of two (a) or three (b) independent experiments or pooled 
from multiple experiments (c). In a, n = 2; in b, c, n is as shown in c. Error 
bars represent mean ± s.e.m. 
Extended Data Figure 9 | IL-13 production by lamina propria ILC2 and 
CD4+ cells. a, b, Lamina propria cells from Il25−/−;Il13Smart, Il13Smart, 
and wild-type control mice analysed by flow cytometry and gated on 
ILC2 (a, Lin−CD45+GATA3+) or CD45+CD4+ (b) cells. IL-13 secretion 
was quantified by measuring surface expression of human CD4, which is 
expressed from the Il13 locus in Il13Smart reporter mice. c, Frequency of 
lamina propria CD4+ cells as a percentage of total CD45+ cells as assessed 
by flow cytometry. d.p.i., days post-N. brasiliensis infection. Data in 
b, c are biological replicates. Data are representative of at least three (a) 
independent experiments, or pooled from multiple experiments (b, c). 
In a, n = 5; in b, c, n is as shown. Error bars represent mean ± s.e.m. 
Extended Data Table 1 | Antibodies and quantitative RT-PCR primers used in this study 

Immunohistochemistry Antibodies 

Target       Conjugation   Host      Source                         Dilution 
dsRED        none          rabbit    Clontech (632496)              1:500 
RFP          biotin        rabbit    Abcam (ab34771)                1:500 
MUC2         none          rabbit    Santa Cruz (sc-15334)          1:100 
LYS          none          rabbit    Dako (2017-06)                 1:1000 
CHGA         none          rabbit    Immunostar (20085)             1:250 
DCLK1        none          rabbit    Abcam (ab31704)                1:1000 
PTGS1        none          goat      Santa Cruz (sc-1754)           1:100 
GFI1B        none          goat      Santa Cruz (sc-8559)           1:100 
EPCAM        AF488         rat       Biolegend (G8.8)               1:250 
KI67         AF488         rat       Biolegend (16A8)               1:100 
rabbit IgG   AF488         goat      Life Technologies F(ab’)2      1:1000 
rabbit IgG   AF555         goat      Life Technologies F(ab’)2      1:1000 
rabbit IgG   AF488         chicken   Life Technologies F(ab’)2      1:1000 
goat IgG     AF555         donkey    Life Technologies F(ab’)2      1:1000 

Flow Cytometry Antibodies 

Target       Conjugation   Source              Clone     Dilution 
CD3          PerCP/Cy5.5   Biolegend           17A2      1:100 
CD19         PerCP/Cy5.5   BD Biosciences      1D3       1:100 
CD11B        PerCP/Cy5.5   Biolegend           M1/70     1:300 
CD5          PerCP/Cy5.5   eBiosciences        53-7.3    1:1000 
NK1.1        PerCP/Cy5.5   eBiosciences        PK136     1:100 
EPCAM        PerCP/Cy5.5   Biolegend           G8.8      1:300 
EPCAM        AF488         Biolegend           G8.8      1:300 
GATA3        AF488         eBiosciences        TWAJ      2.5 µl/test 
EPCAM        APC           Biolegend           G8.8      1:300 
CD4          APC           BD Biosciences      RM4-5     1:100 
CD45         BV605         Biolegend           30-F11    1:100 
human CD4    PE            eBiosciences        RPA-T4    5 µl/test 
DCLK1        none          Abcam                         1:1000 
rabbit IgG   AF488         Life Technologies             1:2000 

qRT-PCR Primers 

Target   Forward Primer (5' -> 3')    Reverse Primer (5' -> 3') 
Dclk1    CAAGCCAGCCATGTCGTTC          TTCCTTTGAAGTAGCGGTCAG 
Chga     ATCCTCTCTATCCTGCGACAC        GGGCTCTGGTTCTCAAACACT 
Chat     GGCCATTGTGAAGCGGTTTG         GCCAGGCGGTTGTTTAGATACA 
Trpm5    TATGGCTTGTGGCCTATGGT         ACCAGCAGGAGAATGACCAG 
Muc2     ATGCCCACCTCCTCAAAGAC         GTAGTTTCCGTTGGAACAGTGAA 
Gnat3    TAGGAGCCGAGAGGACCAAG         GCTGGTATTCAGATGCCCTTTC 
Lyz1     GAGACCGAAGCACCGACTATG        CGGTTTTGACATTGTGTTCGC 
Lyz2     ATGGAATGGCTGGCTACTATGG       ACCAGTATCGGCTATTGATCTGA 
Ptgs1    ATGAGTCGAAGGAGTCTCTCG        GCACGGATAGTAACAACAGGGA 
Gfi1b    ATGCCACGGTCCTTTCTAGTG        GGAAGGCTCTGGTTCAGCAA 
Il25     ACAGGGACTTGAATCGGGTC         TGGTAAAGTGGGACGGAGTTG 
Tslp     ACGGATGGGGCTAACTTACAA        AGTCCTCGATTTGCTCGAACT 
Il33     GCTGCGTCTGTTGACACATTGAG      GGTCTTGCTCTTGGTCTTTTCCAG 
Rps17    CGCCATTATCCCCAGCAAG          TGTCGGGATCCACCTCAATG 
Intestinal epithelial tuft cells initiate type 2 
mucosal immunity to helminth parasites 


François Gerbe, Emmanuelle Sidot, Danielle J. Smyth, Makoto Ohmoto, Ichiro Matsumoto, Valérie Dardalhon, 
Pierre Cesses, Laure Garnier, Marie Pouzolles, Bénédicte Brulin, Marco Bruschi, Yvonne Harcus, 
Valérie S. Zimmermann, Naomi Taylor, Rick M. Maizels & Philippe Jay 


Helminth parasitic infections are a major global health and social 
burden¹. The host defence against helminths such as Nippostrongylus 
brasiliensis is orchestrated by type 2 cell-mediated immunity². 
Induction of type 2 cytokines, including interleukin (IL)-4 
and IL-13, induces goblet cell hyperplasia with mucus production, 
ultimately resulting in worm expulsion³,⁴. However, the mechanisms 
underlying the initiation of type 2 responses remain incompletely 
understood. Here we show that tuft cells, a rare epithelial cell type in 
the steady-state intestinal epithelium⁵, are responsible for initiating 
type 2 responses to parasites by a cytokine-mediated cellular 
relay. Tuft cells have a Th2-related gene expression signature⁶ 
and we demonstrate that they undergo a rapid and extensive 
IL-4Rα-dependent amplification following infection with helminth 
parasites, owing to direct differentiation of epithelial crypt 
progenitor cells. We find that the Pou2f3 gene is essential for tuft 
cell specification. Pou2f3−/− mice lack intestinal tuft cells and have 
defective mucosal type 2 responses to helminth infection; goblet 
cell hyperplasia is abrogated and worm expulsion is compromised. 
Notably, IL-4Rα signalling is sufficient to induce expansion of the 
tuft cell lineage, and ectopic stimulation of this signalling cascade 
obviates the need for tuft cells in the epithelial cell remodelling of the 
intestine. Moreover, tuft cells secrete IL-25, thereby regulating type 2 
immune responses. Our data reveal a novel function of intestinal 
epithelial tuft cells and demonstrate a cellular relay required for 
initiating mucosal type 2 immunity to helminth infection. 

Experimental subcutaneous infection of mice with N. brasiliensis 
(Nb) stage 3 larvae induces a typical type-2 response that involves a 
remodelling of epithelial cell populations, with goblet cell hyperplasia 
visible as soon as 5 days post-infection**. Nb L3 larvae first migrate 
from their injection site to the lungs, where they moult to the L4 stage, 
are coughed up, and swallowed to reach the intestines (day 2 post infec- 
tion) where they mature and lay eggs (starting 5 days post-infection). 
Nb induces a rapid and robust type 2 response, resulting in worm expul- 
sion by 6-8 days post infection. 

While the doublecortin-like kinase 1 (Dclk1)-expressing tuft cells 
represent only 0.4% of intestinal epithelial cells in naive mice’, we found 
that Nb infection resulted in an 8.5-fold expansion in tuft cells (Fig. 1a, b), 
first detected by 5 days post-infection in intestinal crypts, where pro- 
liferative epithelial progenitor cells reside, and also in the villi by 7 days 
post infection (Fig. 1c, Extended Data Fig. 1a). The kinetics of tuft 
cell expansion was equivalent to that of goblet cells (Fig. 1d, Extended 
Data Fig. 1b). Neo-differentiated tuft cells were indistinguishable from 
tuft cells present in naive mice, as evaluated by expression of estab- 
lished tuft cell markers, including Dclk1, Sry-related transcription fac- 
tor 9 (Sox9), and phospholipase C gamma 2 (Plcy2) (Extended Data 
Fig. 1c)®®. All tuft cells, characterized by Dclk1 and growth factor 
independent 1b (Gfilb)* expression also co-expressed the Pou domain, 
class 2, transcription factor 3 (Pou2f3) (Fig. 2a). In addition, rare 
(<3%, n= 400 cells counted) Pou2f3*;Dclk1 or Pou2f3*;Dclk1~ cells 
Figure 1 | Rapid amplification of the tuft cell lineage following infection 
with Nb. a, Presence of tuft cells in the intestinal epithelia of naive and 
Nb-infected mice 7 days post infection, visualized by expression of the 
Dclk1 marker. b, 8.7-fold increase of tuft cell numbers (1.8 ± 1.4 to 
15.6 ± 4.8 per crypt-villus axis) in Nb-infected mice compared to naive 
mice, 7 days post infection (n = 50 crypt-villus units per mouse; 3 mice 
per condition). Data are shown as means ± s.d. (P < 0.0001, two-tailed 
Student's t-test with Welch's correction). c, Changes in the Dclk1-expressing 
tuft cell population in intestinal crypts are presented at the indicated time 
points post infection. Quantification is shown in Extended Data Fig. 1a. 
d, Corresponding goblet cell hyperplasia associated with numerous and 
larger mucus vacuoles, detected by periodic acid-Schiff (PAS) staining. 
Dclk1 cells are also visualized in brown. Quantification is shown in 
Extended Data Fig. 1b. Scale bars, 20 µm. All panels show representative 
pictures of experiments replicated 3 times in 3 mice per condition. 
Figure 2 | Absence of tuft cells in the intestinal epithelium of Pou2f3−/− 
mice. a, Pou2f3 is expressed specifically in tuft cells of the intestinal 
epithelium as determined by co-staining for Pou2f3 and established 
markers of tuft cells such as Dclk1 and Gfi1b. b, Pou2f3 deletion results 
in the absence of tuft cells as monitored by staining intestinal epithelium 
from Pou2f3+/+ and Pou2f3−/− mice with Pou2f3-, Dclk1- and Sox9- 
specific antibodies. a, b, Three mice per genotype were used for staining 
experiments. Scale bars, 20 µm. c, Pou2f3 deficiency does not affect the 
proliferation zone (P = 0.22), stem cell compartment (P = 0.66), enterocyte 
(not counted), goblet (P = 0.83), Paneth (P = 0.60) or enteroendocrine 
(P = 0.23) cell lineages as monitored by Ki67, Olfm4, alkaline phosphatase, 
PAS staining, UEA1 lectin, and Insm1, respectively (n = 50 crypt-villus 
units per mouse; 3 Pou2f3−/− and 3 wild-type mice). Data are shown 
as means ± s.d. A two-tailed Student’s t-test was used. Pictures show 
representative experiments replicated 3 times. 


were found at the base of crypts, probably representing early 
differentiating tuft cells since villus Pou2f3+ cells always co-express Gfi1b 
and Dclk1. Following infection, the percentage of proliferating tuft 
cells in crypts increased from 13 ± 5.6% to 24 ± 14.9% (P = 0.035), 
indicating that cell proliferation contributes to the amplification 
of the tuft lineage during type 2 responses (Extended Data 
Fig. 1d, e). Examination of the location of tuft cells present in 
Nb-infected mice revealed that some tuft cells differentiate close to 
the stem cell zone (Extended Data Fig. 1d), suggesting that biased 
differentiation from the recently described Lgr5+ slowly cycling 
early secretory progenitors⁹ and Dll1+ secretory progenitors¹⁰ 
also contributes to tuft cell lineage amplification. The increase in 
tuft cells was not due to a non-specific amplification of all secretory 
cell lineages as the number of enteroendocrine cells expressing the 
insulinoma-associated 1 (Insm1) marker¹¹, another secretory 
lineage of the intestinal epithelium, was significantly (P = 0.008) reduced 
(Extended Data Fig. 1f, g). 

To determine whether the increase in the tuft cell population follow- 
ing infection with Nb was specific to C57BL/6 mice, we infected BALB/c 
mice and also observed a significant increase in tuft cell numbers 
(14-fold, P< 0.0001; Extended Data Fig. 2a, b). Moreover, this response 
seems to be a common adaptation to helminth infection in general, as 
infection of C57BL/6 and BALB/c mouse strains with Heligmosomoides 
polygyrus¹² also resulted in a significant increase in tuft cell numbers 
(6.1- and 8.3-fold, respectively, P < 0.0001; Extended Data Fig. 2c, d). 
Tuft cell hyperplasia following Nb infection also occurred in Rag−/− 
mice (10-fold; P < 0.0001) and therefore does not require functional 
adaptive immunity (Extended Data Fig. 2e, f). 

Epithelial remodelling following helminth infection includes goblet 
cell hyperplasia and changes in mucus composition, associated with 
protective type 2 immunity¹³,¹⁴. To investigate the role of tuft cells 
in this process, we identified and characterized a tuft-cell-deficient 
mouse line. Mice deficient for the Pou2f3 transcription factor lack all 
Pou2f3-expressing taste receptor cells including sweet, umami and 
bitter taste cells¹⁵, as well as Trpm5-expressing chemosensory cells 
in the nasal cavity¹⁶ and olfactory epithelium¹⁷. Analysis of Pou2f3- 
deficient mice revealed a unique phenotype in the intestinal epithe- 
lium, with a complete absence of tuft cells as assessed by the absence 
of Pou2f3, Dclk1 and Sox9 expression outside the crypt compartment 
(Fig. 2b). The stem cell compartment, proliferation zone, and differ- 
entiation of enterocytes, goblet, enteroendocrine and Paneth cells 
were not affected (Fig. 2c and Extended Data Fig. 3). Furthermore, 
the distribution of immune cells in lymph nodes, mesenteric lymph 
nodes, spleen and lamina propria of Pou2f3+/+ and Pou2f3−/− mice 
was equivalent (Extended Data Fig. 4) and lymphocytes were capable 
of responding to immune stimulation (Extended Data Fig. 5). Notably, 
type 2 innate lymphoid cells (ILC2s), a lineage that plays a critical role 
in secreting type 2 cytokines in response to helminth infection¹⁸,¹⁹, 
were present in both the mesenteric lymph nodes and lamina propria 
of Pou2f3−/− mice, at levels that were not significantly different from 
wild-type mice (Extended Data Fig. 6a–c). Therefore, the absence of 
Figure 3 | Impaired type 2 responses in tuft cell-deficient mice. 
a, Live adult worm counts in the small intestines of wild-type and Pou2f3−/− 
mice at days 9, 13 and 42 post infection with Nb (n = 3 wild-type 
mice and 4 Pou2f3−/− mice for each time point except for day 9 where 
n = 3 Pou2f3−/− mice). b, Kinetics of Nb infection in 3 wild-type and 4 
Pou2f3−/− mice, assessed by faecal egg count. a, b, Each circle represents 
an individual mouse. The x axis indicates time (days) post-infection. 
Average values ± s.d. are shown. c, Immunohistochemistry illustrating the 
proximal and distal small intestinal epithelium of infected wild-type and 
Pou2f3−/− mice 7 days after infection (n = 3 mice per genotype). 


Pou2f3 does not affect global immunity or intestinal epithelium for- 
mation. Rather, Pou2f3 represents the first identified transcription factor 
that is specifically required to specify the tuft cell lineage in the intestinal 
epithelium, analogously to Sox9 for Paneth”®! and Ngn3 for enteroendocrine” 
cell lineages. Thus, Pou2f3−/− mice represent a powerful model 
to study the function of tuft cells. 

Pou2f3+/+ and Pou2f3−/− mice were infected with Nb and analysed 
at several time points. In Pou2f3+/+ mice, only a few worms were 
found after 9 days and expulsion was nearly complete after 13 days. In 
sharp contrast, numerous worms were found in Pou2f3−/− mice up to 
42 days post infection (Fig. 3a, b), not only in the proximal part of 
the small intestine, their normal site of attachment”’, but also in more 
distal locations. Together, these data strongly suggest that a compromised 
type-2 response is responsible for prolonged worm survival in 
Pou2f3−/− tuft-cell-deficient mice. 

To understand the mechanisms underlying the delayed worm 
expulsion in Pou2f3-deficient mice, we analysed the type-2 response- 
dependent remodelling of the intestinal epithelium 7 days after infec- 
tion, a time point at which adult worms were detected in all infected 
animals. In Pou2f3+/+ mice, the intestinal epithelium displayed extensive 
and generalized goblet cell hyperplasia, with large mucus vacuoles, 
and tuft cell hyperplasia (Fig. 3c). Expectedly, Pou2f3−/− mice 
completely lacked tuft cells and, in contrast to Pou2f3+/+ mice, were 
devoid of overt goblet cell hyperplasia, with focal and moderate hyper- 
plasia limited to the most proximal small intestine, and lower goblet 
cell numbers than wild-type mice (Fig. 3c, Extended Data Fig. 7a, and 
Supplementary Information 1 and 2). Therefore, tuft-cell-deficient 
mice have a delayed type 2 response, with deficient mucosal goblet cell 
hyperplasia and delayed control of Nb infection. 

The goblet cell-produced Resistin-like beta (Retnlβ) molecule, 
strongly induced by type 2 cytokines, has direct anti-helminth activity 
that facilitates expulsion***. We compared expression of Retnlβ in 
wild-type and Pou2f3−/− mice 7 days after Nb infection, when worm 
expulsion had started in wild-type mice. Retnlβ was strongly expressed 
in hyperplastic goblet cells in Pou2f3+/+ mice, but was only weakly 
expressed in Pou2f3−/− mice (Fig. 3c, d and Extended Data Fig. 7a). 
Dclk1 and PAS stainings, respectively, reveal tuft and goblet cells, as well as 
Retnlβ production. d, Quantification of IL-13 and Retnlβ in the intestinal 
mucosa of naive, and Nb-infected Pou2f3+/+ and Pou2f3−/− mice by 
RT-PCR, 7 days after infection. Representative gels are shown with relative 
Gapdh expression presented as an internal control. e, Histological analysis 
showing tuft (Dclk1 staining) and goblet (PAS staining) cells in naive and 
Nb-infected Il4Rα+/+ and Il4Rα−/− mice 7 days post infection (n = 3 mice 
per genotype). Scale bars, 20 μm. All panels show representative pictures of 
experiments replicated 3 times. 


Moreover, while IL-4 levels were equivalent in mucosal tissue of 
Nb-infected Pou2f3+/+ and Pou2f3−/− mice, IL-13 levels were markedly 
decreased in the latter (Fig. 3d). As both IL-4 and IL-13 type 2 cytokines 
are known to regulate Retnlβ expression’, and IL-4 is dispensable during 
type 2 responses to Nb”, our data strongly suggest that defective 
IL-13 production is responsible for the decreased Retnlβ expression 
in Nb-infected Pou2f3−/− mice. Thus, we identify a defective IL-13/Retnlβ 
axis in tuft-cell-deficient mice with impaired worm expulsion. 
We next studied the link between tuft cells and type-2-mediated 
mucosal adaptation following Nb infection. IL-4Rα signalling is 
essential for both goblet cell hyperplasia and type 2 immune responses 
occurring upon helminth infection, and deletion of the Il4ra gene abrogates 
Nb expulsion”>”®. Importantly, the Nb-induced tuft cell hyperplasia 
occurring in wild-type mice 7 days post infection was absent in 
Il4rα−/− mice, as was goblet cell hyperplasia (Fig. 3e, Extended Data 
Fig. 7b). This demonstrates the critical role of IL-4Rα signalling in 
the expansion of the tuft cell population following helminth infection. 
We then examined whether IL-4Rα signalling is sufficient to trigger 
tuft cell lineage hyperplasia by injecting naive C57BL/6 mice with 
recombinant murine IL-4 and/or IL-13 (rIL-4/rIL-13) for 5 days and 
assessing the histology of the intestinal epithelium. rIL-4/rIL-13 injection 
induced goblet cell hyperplasia together with tuft cell expansion 
(Extended Data Fig. 7c). Importantly, treatment of Pou2f3−/− mice with 
rIL-4/rIL-13 also resulted in goblet as well as Paneth cell hyperplasia, 
indicating a function of tuft cells upstream of IL-4/IL-13 (Extended Data 
Fig. 7c, d). Moreover, ectopic IL-4/IL-13 induced Retnlβ expression in 
goblet cells, independently of the Pou2f3 genotype. Retnlβ expression 
was found predominantly in crypts and was therefore delayed compared 
to the onset of goblet cell hyperplasia (Extended Data Fig. 7c), and 
quantitatively lower than in an infectious context (Fig. 3c). Thus, IL-4Rα 
signalling is sufficient to induce an expansion of the tuft cell lineage. 
Furthermore, ectopic stimulation of this signalling cascade obviates 
the need for tuft cells in the epithelial cell remodelling of the intestine, 
including induction of Retnlβ expression by hyperplastic goblet cells. 
To determine whether the IL-4/IL-13-induced goblet cell hyperplasia 
was epithelial-cell-autonomous, we used an ex vivo organoid culture 
cell lineage. Furthermore, these data demonstrate that the intestinal 
epithelial response to IL-4/IL-13 is epithelium-autonomous and does 
not require additional stromal signals. Together, our data identify a 
novel function of tuft cells in initiating the mucosal type 2 responses 
with a positive feedback loop through IL-13-producing immune cells 
that, in turn, amplify the tuft cell lineage. 

Finally, we investigated the physiological function of the tuft cell 
hyperplasia, fully established by 7 days post-infection when worm 
expulsion starts. IL-25 is an alarmin molecule produced by an as yet 
unidentified intestinal epithelial cell type, capable of initiating type 2 
responses by stimulating ILC2s to produce IL-4 and IL-13'*'!°. We thus 
analysed Il25 messenger RNA expression in Pou2f3+/+ and Pou2f3−/− 
mice infected with Nb. Nine days after infection, Il25 expression was 
higher in the intestinal mucosa of Pou2f3+/+ mice than in tuft-cell-deficient 
Pou2f3−/− mice (Fig. 4a). Moreover, IL-25 protein expression 
was restricted to tuft cells in naive mice (Fig. 4b and Extended Data 
Fig. 9a) and, consistent with these data, Il25 mRNA was only detected 
in the FACS-enriched tuft cell fraction of the intestinal epithelium 
(Fig. 4c and Extended Data Fig. 9b). Following Nb infection, IL-25 
expression remained restricted to tuft cells (Fig. 4b). Concomitant 
with tuft cell hyperplasia, epithelial IL-25 expression peaks 9 days 
after infection with Nb, at the time of worm expulsion, for which 
it is required’*. In accord with a critical role for IL-25-secreting 
tuft cells in the expansion of ILC2s, we found that the percentage 
of Lin−CD127+Gata3+KLRG1+ ILC2s was not significantly augmented 
by Nb infection of Pou2f3−/− mice, but was significantly 
augmented in wild-type mice. Indeed, tuft cells were required for 
the global induction of an adaptive immune response as helminth 
infection induced an approximately 2.5-fold expansion of both ILC2 
and Th2 subsets in mesenteric lymph nodes, whereas these subsets 
remained unchanged in the infected Pou2f3−/− mice (P=0.02, 


Figure 4 | Tuft cells express IL-25, and rIL-25 is sufficient to initiate 
type 2 mucosal responses in the absence of tuft cells. a, Analysis of Il25 
mRNA expression in Pou2f3+/+ and Pou2f3−/− mice infected with Nb, 
9 days post-infection, by RT-PCR. Gapdh expression is presented as an 
internal control. b, Immunohistochemistry showing IL-25 expression 
in naive and Nb-infected wild type mice. Blue staining, nuclear Pou2f3 
expression revealed with NBT/BCIP. Brown staining, IL-25 expression 
revealed with DAB (n = 3 naive and 3 infected mice). Scale bars, 20 μm. 
c, PCR with reverse transcription (RT-PCR) showing predominant Il25 
and Pou2f3 mRNA expression in the FACS-enriched tuft cells fractions 
(+) and the other epithelial cells (−), obtained from 3 independent 
mice. Gapdh is shown as an internal control. d, Rescue of the Pou2f3 
deficiency by treatment with exogenous rIL-25, as assessed by egg counts 
during a time course of infection with Nb (n = 7 mice for the wild-type 
and Pou2f3−/− mice, and n = 6 for the rIL-25-treated Pou2f3−/− mice). 
Average ± s.d. are presented, as well as exact P values when < 0.05 (two-tailed 
Mann-Whitney U-test). e, Scheme illustrating the function of tuft cells in 
initiating type 2 responses following infection with intestinal helminths. 
Left, normal epithelium undergoing infection with a helminth. Right, tuft 
cell-dependent epithelial remodelling during type 2 responses. All panels 
show representative pictures of experiments replicated 3 times. 


system’ that allows physiological responses of an isolated intestinal 
epithelium to be analysed in the absence of stromal cues. As expected, 
tuft cells were absent in Pou2f3−/− organoid cultures (Extended Data 
Fig. 8a). Moreover, in wild-type organoids, the tuft cell population 
increased as early as 48 h following addition of rIL-4/rIL-13 (Extended 
Data Fig. 8a, b). Treatment with rIL-4 or rIL-13 alone yielded identical 
results to the rIL-4/rIL-13 mixture (Extended Data Fig. 8c). Treatment 
of Pou2f3−/− organoids with rIL-4/rIL-13 also triggered goblet cell 
hyperplasia equivalent to that detected in Pou2f3+/+ organoids, as 
indicated by Retnlβ expression (Extended Data Fig. 8d), revealing 
the critical role of type 2 cytokine signalling downstream of the tuft 


P=0.0005, respectively; Extended Data Fig. 6d-f). It is likely that 
these immune defects are directly due to the paucity of IL-25 as 
treatment of Nb-infected Pou2f3−/− mice with rIL-25 almost completely 
compensated for the absence of tuft cells, promoting an efficient 
worm expulsion (Fig. 4d). IL-25 thus provides a mechanistic 
link between tuft cells, promotion of type 2 responses and worm 
expulsion. 

Taken together, our data reveal a critical function of tuft cells in ini- 
tiating mucosal type 2 responses following infection with helminths 
through IL-25 secretion. In the absence of tuft cells, IL-25 and IL-13 
expression remain low, and type 2 mucosal responses and worm expul- 
sion are delayed. Our study demonstrates a requirement for tuft cells 
upstream of IL-4/IL-13, with these cytokines driving tuft cell hyperpla- 
sia, thereby amplifying a feed-forward loop to orchestrate a rapid and 
effective anti-helminth immunity (Fig. 4e). 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 

Animal strains. The Pou2f3-deficient mice (Pou2f3tm1Abek) have been previously 
described". Il4ra-deficient mice”? were provided by M. Kopf (Basel Institute for 
Immunology, Switzerland). C57BL/6 and BALB/c mice were obtained from Charles 
River Laboratories. All the mice were maintained in an SPF animal facility and 
were naive before the experiments. All animal experiments were approved by the 
Institutional Animal Care and Use Committee of Monell Chemical Senses Center 
or by the French Agriculture and Forestry Ministry. Unless specified, all mice 
were on a C57BL/6 genetic background. Mice were analysed at 10 weeks of age, 
regardless of the sex. For comparisons of wild-type and KO mice, littermates were 
analysed. Three mice per condition were analysed in all experiments. No statistical 
methods were used to predetermine sample size, no criteria of exclusion were 
defined and the experiments were not randomized nor blinded to the investigator. 
Immunophenotyping and flow cytometry analyses. Cells isolated from peripheral 
lymph nodes, mesenteric lymph nodes, spleen and lamina propria were stained 
with Sytox blue or Live/dead fixable viability dye (Life Technologies and eBioscience, 
respectively) together with the appropriate conjugated anti-CD3, CD45.2, 
CD62L, CD4, CD8, CD44, TCRγδ, CD19, NK1.1, Gr1, CD11b, CD11c, and Foxp3 
antibodies (eBioscience or Becton Dickinson, San Diego, CA). For ILC2 staining, 
cells were stained with a lineage cocktail and CD45.2 (clone 104), and lineage-negative 
CD45+ cells were assessed for expression of CD127 (clone SB14), KLRG1 
(clone 2F1), Sca-1 (clone D7), CD25 (clone 7D4), and intracellular expression of 
Gata-3 (clone L50-823). Th2 cells were identified on the basis of Gata-3-expressing 
cells within the CD3+CD4+ subset. 

IL-6, IL-12, TNFα, IL-10, MCP-1 and IFN-γ production was assessed in the culture 
supernatant of LPS/IL-4-activated splenocytes using a Cytometric Bead Array 
(CBA) Kit (BD Biosciences). To assess intracellular cytokine production, freshly 
isolated and anti-CD3/CD28 stimulated LN cells were activated with PMA (Sigma-Aldrich; 
100 ng ml−1)/ionomycin (Sigma-Aldrich; 1 μg ml−1) in the presence of 
brefeldin A (Sigma-Aldrich; 10 μg ml−1) for 4 h at 37°C, fixed, permeabilized and 
then stained with specific antibodies against IL-2 and IFN-γ. 

Foxp3 staining was performed following fixation/permeabilization 
(eBioscience). Stained cells were assessed by flow cytometry (LSR Fortessa, 
Becton Dickinson, San Jose, CA) and data were analysed by FACSDiva (v.8.0, BD 
Biosciences) and FCAP Array Software (CBA analysis). 

Ex vivo stimulations. LN cell activation was performed using plate-bound 
anti-CD3 (clone 2C11; 1 μg ml−1) and anti-CD28 (clone PV-1; 1 μg ml−1) mAbs 
in RPMI 1640 media (Life Technologies) supplemented with 10% FCS, 2 mM glutamine 
and 1% penicillin/streptomycin. Exogenous IL-2 (100 U ml−1) was added 
every other day starting at day 2 post-activation. Cell proliferation was monitored 
by labelling with CFSE (Life Technologies; 2.5 μM) for 3 min at room temperature. 
Splenocytes were activated with LPS (20 μg ml−1) and IL-4 (25 ng ml−1). 
Supernatants were collected 40 h following activation. 

Immunoglobulin detection. IgG detection in supernatants of LPS/IL-4-stimulated 
splenocytes was assayed by ELISA. Microtiter plates (Maxisorb, Nunc) were saturated 
overnight at 4°C with 100 μl of anti-IgG2a, anti-IgG2b, anti-IgA antibodies 
or anti-IgG (Fab’2) resuspended in PBS (5 μg ml−1). Plates were washed 3 times 
with 0.1% Tween-containing PBS (PBST). Samples (1/2 dilution) were diluted 
in a final volume of 100 μl per well of PBST-1%BSA and incubated for 2 h at RT. 
Following washes, peroxidase-conjugated anti-mouse anti-IgG2a, anti-IgG2b, 
anti-IgA (Serotech) or anti-IgG gamma-chain (SIGMA) antibodies were added 
in PBST-1%BSA (1:1,000 dilution; 100 μl per well) and incubated for 1 h at 37°C. 
Immunoglobulin levels were then revealed with o-phenylenediamine (Sigma; 
4g ml!) in 0.1 M Na citrate and 003% hydrogen peroxide. Absorbance was 
measured at 450 nm using an automated plate reader (InfiniteM200Pro, TECAN) 
after 5 min at room temperature. 

Tuft cell sorting. Freshly isolated small intestines of C57BL/6 mice were flushed 
with PBS and incised along their length. The tissue was then incubated in 30 mM 
EDTA (Sigma) in HBSS pH 7.4 (Life Technologies) on ice, and transferred to 
DMEM (Life Technologies) supplemented with 10% FBS (Sigma). Vigorous shaking 
yielded the epithelial fraction that was then incubated with 100 μl of Dispase 
(BD Biosciences) in 10 ml of HBSS, supplemented with 100 μl of DNase I at 2,000 
Kunitz (Sigma). Single cell preparation obtained by filtration on a 30 μm mesh was 
incubated with a phycoerythrin rat anti-mouse Siglec-F antibody for 30 min at 
4°C (BD Pharmingen 552126), washed with HBSS and resuspended in appropriate 
volume of HBSS pH 7.4 supplemented with 5% FBS before staining with 7-amino-actinomycin 
D (Life Technologies) to exclude dead cells. Siglec-F+ live cells were 
sorted using a FACSAria (Becton Dickinson), directly in RLT lysis buffer (Qiagen) 
for subsequent RNA extraction. 

Parasite infections. Pou2f3+/+, Pou2f3−/−, Il4ra+/+ and Il4ra−/−, C57BL/6 and 
BALB/c wild-type mice were used for Nb infection experiments. Mice were 
infected with 250 L3 infective Nb larvae by sub-cutaneous injection”? or with 200 
H. polygyrus L3 larvae by gavage. Infection parameters were monitored by enumeration 
of live adult worms in the small intestinal tissue by two different investigators 
blinded to the study groups. 

Reagents. Recombinant murine IL-4 (214-14), recombinant murine IL-13 
(210-13) were purchased from PeproTech, and recombinant murine IL-25 (1399) 
was from R&D Systems. For animal treatment, mice were injected intraperitoneally 
daily with a mixture of both interleukins or with rIL-25 (40 μg per kg of body 
weight). For rescue experiments in Pou2f3−/− mice, rIL-25 was injected from day 3 
post infection. 

Organoid culture. Organoid cultures were performed as previously described”” 
using intestinal crypts from Pou2f3+/+ and Pou2f3−/− mice. Organoid lines 
were passaged up to 10 times before experiments to ensure pure epithelial 
cultures. When indicated, cultures were stimulated with recombinant murine IL-4 
(400 ng ml−1), recombinant murine IL-13 (400 ng ml−1) or an equimolar mixture 
of the two cytokines. For histological studies, organoids were washed twice in cold 
PBS to dissolve the Matrigel, fixed overnight in neutral-buffered formalin at 4°C 
and included in Histogel (Thermo Scientific) before paraffin embedding (n = 3 
experiments from independent mice). 

RNA extraction and PCR. Total RNA from intestinal organoids or snap-frozen 
intestinal tissues was isolated using TRIzol (Life Technologies) followed by pre- 
cipitation with isopropanol. Further RNA purification was carried out on RNeasy 
columns (Qiagen, 74104) and DNase treatment. In the case of Siglec-F+ sorted tuft 
cells, extraction and DNase treatment were performed using RNeasy Micro KIT 
following the manufacturer’s instructions. Reverse transcription was performed 
with 500 ng–2 μg of purified RNA using Transcriptor First Strand cDNA synthesis 
KIT (Roche) according to the manufacturer’s instructions. For qRT-PCR experiments, 
gene expression was quantified on the LightCycler 480 using LightCycler 
480 SYBR Green I Master (Roche). The results from three independent organoid 
cultures were normalized to the expression level of Gapdh and Hprt and relative 
expression was obtained using the ΔΔCt method. Primer sets for each gene are 
listed in Extended Data Table 1. PCR analyses were performed on an Eppendorf 
Mastercycler, using the primer sets listed in the Extended Data Table 1. 
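
The relative-expression calculation named above (the ΔΔCt method, normalized to Gapdh and Hprt) can be illustrated with a minimal sketch. It is not the authors' analysis code. The function names and Ct values are hypothetical, amplification efficiency is assumed to be 100% (a doubling per cycle), and the two reference genes are combined by averaging their Ct values, which is one common convention and an assumption here.

```python
from statistics import mean


def delta_ct(ct_target, ct_refs):
    """Ct of the target gene minus the mean Ct of the reference genes."""
    return ct_target - mean(ct_refs)


def relative_expression(sample, control):
    """Fold change of a target gene in `sample` relative to `control`.

    Each argument is a tuple (ct_target, [ct_ref1, ct_ref2, ...]).
    Assumes 100% PCR efficiency, so one cycle equals a two-fold difference.
    """
    ddct = delta_ct(*sample) - delta_ct(*control)
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Hypothetical Ct values: target gene, then [Gapdh, Hprt] as references.
    treated = (22.0, [18.0, 18.0])
    untreated = (24.0, [18.0, 18.0])
    print(relative_expression(treated, untreated))
```

A lower Ct means earlier detection, so a target that crosses threshold two cycles earlier than in the control, after reference normalization, corresponds to a four-fold higher relative expression.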

In situ hybridization. Single colorimetric and double fluorescent in situ hybridiza- 
tion analyses were carried out as described previously”. Briefly, digoxigenin- and 
fluorescein-labelled antisense RNAs were synthesized and used as probes after 
fragmentation to about 150 bases under alkaline conditions. Small intestines were 
dissected from mice shortly after euthanasia and embedded in frozen O.C.T. com- 
pound (Sakura Finetech). Fresh-frozen sections were prepared using a cryostat 
(CM1900, Leica Microsystems), fixed with 4% paraformaldehyde, hybridized 
with probe(s), and then washed under stringent conditions. Hybridized probes 
were immunohistochemically detected using alkaline phosphatase-conjugated 
anti-digoxigenin antibody (Roche Diagnostics) and biotin-conjugated anti- 
fluorescein antibody (Vector Laboratories). Signals were developed using 4-nitro 
blue tetrazolium chloride/5-bromo-4-chloro-3-indolyl-phosphate as chromogenic 
substrates for single colorimetric analyses or the Tyramide Signal Amplification 
method and HNPP Fluorescent Detection Set (Roche Diagnostics) for double- 
fluorescent analyses. Stained and fluorescent images were obtained on a Nikon 
eclipse 80i microscope (Nikon Instruments Inc.) equipped with a DXM1200C 
digital camera (Nikon) and a Leica SP2 confocal scanning microscope (Leica), 
respectively. RNA probes generated were as follows: nucleotides 72-2363 of 
Pou2f3 (GenBank accession number NM_011139), nucleotides 1-2228 of Slc15a1 
(GenBank accession number BC116248), nucleotides 1-3255 of Muc2 (GenBank 
accession number BC034197), nucleotides 1-1102 of Gcg (GenBank accession 
number BC012975), nucleotides 1-584 of Gip (GenBank accession number 
BC104314), nucleotides 27-400 of Defcr6 (GenBank accession number M33225), 
nucleotides 1-1628 of Olfm4 (GenBank accession number BC141127), nucleotides 
1-2750 of Dclk1 (GenBank accession number BC050903), and nucleotides 1-2797 
of Ptgs1 (GenBank accession number BC005573). 

Fluorescent and bright-field immunohistochemistry on paraffin-embedded 
tissue. Tissue dissection, fixation, and immunohistochemistry on thin sec- 
tions of paraffin-embedded tissue were performed essentially as described 
previously®. Primary antibodies used in this study were as follows: anti-Sox9 
(AB5535; Millipore), anti-Cox1 (sc-1754; Santa Cruz), anti-PCNA (sc-56; Santa 
Cruz), anti-Plcγ2 (sc-5283, Santa Cruz), anti-Gfi1b (sc-8559; Santa Cruz), anti-Pou2f3 
(sc-330, Santa Cruz and HPA019652, Prestige Antibodies), anti-Dclk1 
(ab31704; AbCam), anti-Ki67 (ab16667; AbCam), anti-Retnlβ (ABIN465494, 
Antibodies online), anti-IL-25 (mAb 1258; R&D Systems). Anti-Insm1 was a 
gift from C. Birchmeier (Max-Delbrück-Center for Molecular Medicine; Berlin; 
Germany). Slides were washed twice with 0.1% PBS-Tween (Sigma-Aldrich) 
before incubation with fluorescent secondary antibodies conjugated with either 
Alexa 488, cyanin-3, or cyanin-5 (Jackson ImmunoResearch Laboratories, 
Inc.) and Hoechst at 2 μg ml−1 (Sigma-Aldrich) in PBS-Triton X-100 0.1% 
(Sigma-Aldrich). Stained slides were washed again in PBS before mounting with 
FluoroMount (Sigma-Aldrich). Methods used for bright-field immunohistochem- 
istry were identical, except that Envision+ (Dako) was used as a secondary reagent. 
Signals were developed with DAB (Sigma-Aldrich) and a haematoxylin counter- 
stain (DiaPath) was used. After dehydration, sections were mounted in Pertex 
(Histolab). All experiments were performed on formalin-fixed tissues and 10mM 
sodium citrate (pH 6.4) treated slides, except for IL-25 staining where Carnoy’s 
fixation and 10 mM Tris-EDTA (pH 9.0) treatment were used. Enterocytes-alkaline 
phosphatase activity was revealed with Fast-red substrate (Sigma-Aldrich). All 
stainings were repeated in 3 mice per group in 3 independent experiments and 
scored by three different investigators blinded to the study groups. 

Microscopy and imaging. Fluorescent pictures were acquired at room tempera- 
ture on an AxioImager Z1 microscope (Carl Zeiss, Inc.) equipped with a camera 
(AxioCam MRm; Carl Zeiss, Inc.), EC Plan Neofluar (5x, NA 0.16; 10x, NA 0.3; 
20x, 0.5 NA; 100x, NA 1.3) and Plan Apochromat (40x, NA 0.95; 63x, NA 1.4) 
objectives, the Apotome Slider system equipped with an H1 transmission grid (Carl 
Zeiss, Inc.), and AxioVision software (Carl Zeiss, Inc.). Bright-field immunohisto- 
chemistry pictures were taken at room temperature on an Eclipse 80i microscope 
(Nikon) with Plan Fluor (10x, NA 0.3; 20x, NA 0.5; 40x, NA 0.75; and 60x, 
NA 0.5-1.25) lenses (Nikon) and a digital camera (Q-Imaging Retiga 2000R with 
a Q-Imaging RGB Slider). Pictures were captured with Q-Capture Pro software 
(Nikon). Post-treatment of pictures (level correction), annotations, and panel com- 
position were performed using Photoshop software (Adobe). 


Statistical analyses. The Prism software was used for descriptive statistical anal- 
yses. For infection monitoring, sample (n) was defined as the number of eggs per 
gram of faeces per mouse. As normal distribution assumption was not met, a two- 
tailed Mann-Whitney U-test was used to calculate the P value. For histological data 
quantification, sample (n) was defined as the number of cells per crypt-villus unit. 
Unless otherwise stated, 50 crypt-villus axes were counted per histological sections 
from 3 mice of each genotype or condition. According to the central limit theorem 
(n> 30), data comparison was achieved with a two-tailed Student's t-test. Welch's 
correction was applied to P-value calculation when homoscedasticity assumption 
was not met (determined with F-test). 
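
The two kinds of test described above can be illustrated with a minimal sketch of their test statistics. The authors report using Prism, so this is not their procedure. The function names and data are hypothetical, and only the statistics (Welch's t with its Welch-Satterthwaite degrees of freedom, and the Mann-Whitney U) are computed. Converting them to P values would require the t and U reference distributions from a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on samples a and b."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    se2 = va / na + vb / nb            # squared standard error of the difference
    t = (mean(a) - mean(b)) / sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


def mann_whitney_u(a, b):
    """Return the U statistic for sample a: pairs where a > b, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


if __name__ == "__main__":
    # Hypothetical egg counts per gram of faeces for two groups of mice.
    group_1 = [120, 90, 150, 110]
    group_2 = [900, 1100, 750, 980]
    print(mann_whitney_u(group_1, group_2))
    print(welch_t(group_1, group_2))
```

The rank-based U statistic makes no normality assumption, which is why it suits small samples of egg counts, whereas Welch's form of the t-test drops the equal-variance assumption of the standard Student's t-test.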

Results are shown as histograms representing means as centre values and standard 
deviation as error bars for each genotype or condition, except when n < 5, 
where individual data points are plotted. 

No statistical methods were used to predetermine sample size and the experi- 
ments were not randomized. Unless otherwise stated, the investigators were not 
blinded to allocation during experiments and outcome assessment. 
Extended Data Figure 1 | Epithelial differentiation parameters during 
Nb infection. a, Graph showing the distribution of Dclk1+ tuft cells in 
naive and infected mice 4, 5 and 7 days post infection. Cells were counted 
in the crypt and villus compartments of n = 50 crypt-villus units per 
mouse with 3 mice per condition. Means of villus/crypt ratio of tuft cell 
numbers are shown. b, Quantification of the goblet cell hyperplasia in 
naive and infected mice 4, 5 and 7 days post infection (n = 50 crypt-villus 
units per mouse; 3 mice per condition). c, Neo-differentiating tuft cells 
following Nb infection are indistinguishable from the tuft cells found in 
naive mouse intestinal epithelium as shown with Sox9 and Plcγ2 stainings 
(n = 3 mice). d, Proliferation status of Pou2f3+ tuft cells in naive and 
infected mice, shown with co-expression with the Ki67 proliferation 
marker. Arrows indicate Ki67+ cells located at various positions along the 
crypt axis. e, Increased proliferation of Pou2f3+ tuft cells during response 
to Nb infection (n = 3 naive and 3 infected mice). f, Dclk1+ tuft cells and 
Insm1+ enteroendocrine cells are distinct populations (n = 3 mice). 
g, Decrease of the Insm1+ enteroendocrine cell population during type 
2 responses to Nb infection, concomitant to the expansion of the tuft cell 
lineage 7 days post infection (n = 3 naive and 3 infected mice). All the 
histograms show means ± s.d. A two-tailed Student's t-test with Welch's 
correction was used, except for g where the 2 groups displayed comparable 
variances. All stainings were repeated 3 times. 
Extended Data Figure 2 | Expansion of the tuft cell lineage is a 
common adaptation of the intestinal epithelium following infection 
with helminth parasites. Tuft cell lineage expansion was assessed 
by Dclk1 immunohistochemistry in 2 different genetic backgrounds 
following infection with two different helminths, at the indicated time 
points. a, b, Naive and Nb-infected BALB/c mice. c, d, Naive and 
H. polygyrus-infected C57BL/6 and BALB/c mice. e, f, Naive and 
Nb-infected C57BL/6 and Rag−/− mice. b, d, f, n = 50 crypt-villus units 
per mouse; 3 mice per condition. Data are shown as means ± s.d. and 
P values are indicated. A two-tailed Student's t-test with Welch’s 
correction was used. Scale bars, 20 μm. All experiments displayed in this 
figure were repeated 3 times. 
Extended Data Figure 3 | Pou2f3 deficiency results in the specific 
absence of tuft cells in the intestinal epithelium. Characterization of 
the intestinal epithelium in Pou2f3-deficient mice as compared with 
wild-type littermate controls (n = 3 mice of each genotype). Left, in situ 
hybridization showing absence of tuft cells (Pou2f3, Cox1) in Pou2f3−/− 
mice, whereas enterocytes (Slc5a1), goblet cells (Muc2), Paneth cells 
(Defcr6) and enteroendocrine cells (glucagon, Gip) are unaffected. 
Right, representative pictures of the in situ hybridization (Olfm4) and 
immunohistochemistry experiments underlying the quantitative analysis 
provided in Fig. 2c showing that the stem cells (Olfm4), proliferative 
compartment (Ki67), and differentiated cell types: enterocytes (alkaline 
phosphatase), Paneth (UEA1), enteroendocrine (Insm1) and goblet 
(PAS staining) cells populations are unaffected in the Pou2f3−/− mice. 
All panels show representative pictures of experiments replicated 3 times 
in 3 different mice. Scale bars, 20 μm. 
Extended Data Figure 4 | Immune cell homeostasis is not altered in 
Pou2f3−/− tuft-cell-deficient mice. a, The repartition of immune cells 
in wild-type and Pou2f3−/− mice was monitored by flow cytometry. 
The presence of T (CD3+), B (CD19+), CD4+, CD8+, naive (Tn; CD3+ 
CD62L+CD44−), central memory (Tcm; CD3+CD62L+CD44+), effector 
memory (Tem; CD3+CD62L−CD44+), regulatory (Treg; CD4+Foxp3+), 
natural killer (NK; CD3−NK1.1+) and myeloid (CD11b+Gr1+) cells 
was assessed by staining with fluorochrome-tagged antibodies and 
representative dot plots are shown. The percentages of positively-stained 
cells are indicated. b, Quantification of the different immune cells in 
lymph nodes (LN), mesenteric lymph nodes (mLN) and spleens (SP) 
of wild-type and Pou2f3−/− mice are presented. Data are means ± s.d. 
(n = 3 mice per genotype). c, Total cells in LN, mLN and SP of wild-type 
and Pou2f3−/− mice are presented as means ± s.d. (n = 3 mice per 
genotype). d, Immune lineage cells in the lamina propria of wild-type 
and Pou2f3−/− mice were monitored by flow cytometry after tissue 
dissociation. The percentage of T cells was assessed within the CD45+ 
haematopoietic gate, CD4, CD8 and gamma-delta T cells (CD8+TCR-γδ+) 
within the CD3+ gate and myeloid cells within the CD3− gate, as indicated. 
Representative dot plots are presented (left). Quantification of immune 
cells within the lamina propria are shown as means ± s.d. (n = 3 mice per 
group). No significant differences were detected for all cell types between 
wild-type and Pou2f3−/− mice (P > 0.05). A two-tailed Student's t-test was 
used. 
Extended Data Figure 5 | Equivalent immune responsiveness of 
wild-type and Pou2f3−/− lymphocytes. a, The level of IL-2 and interferon 
gamma (IFN-γ) production by Pou2f3+/+ and Pou2f3−/− CD4 and 
CD8 lymph node T cells was monitored directly after ex vivo isolation 
and representative histograms are presented (left). Quantification of 
cytokine secreting CD4 and CD8 T cells are presented as means ± s.d. 
(n = 3 per group; P > 0.05). b, CFSE-loaded T cells were activated with 
immobilized anti-CD3/anti-CD28 antibodies for 2 days and proliferation 
was monitored as a function of fluorescence dilution. Representative 
histograms for CD4 and CD8 T cells are shown. c, IFN-γ production in 
wild-type and Pou2f3−/− lymphocytes was assessed at day 6 post CD3/CD28 
stimulation and representative plots for CD4 and CD8 T cells are 
presented. d, Splenocytes from wild-type and Pou2f3−/− mice were 
activated with LPS+IL-4 for 40 h and levels of secreted IgG, IgG2a, 
IgG2b and IgA were monitored by ELISA. Means ± s.d. are presented. 
e, Splenocytes were activated as above and levels of TNF-α, IFN-γ, 
MCP-1, IL-10, IL-6, and IL-12 were monitored by cytometric bead array. 
Means ± s.d. are presented. A two-tailed Student's t-test was used. 


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


[Figure panels: flow cytometry plots (live/dead gate; Lin−CD127+, GATA-3+, KLRG1+) for mesenteric lymph nodes (mLN) and lamina propria (LP) of naive mice, with KLRG1 and GATA-3 axes; bar charts for WT, WT inf, KO and KO inf groups; ILC2 and Th2 comparison for WT and KO.]


Extended Data Figure 6 | Defective induction of type 2 immunity in
Pou2f3−/− mice following helminth infection. a, Flow cytometry gating
strategy for analysis of the innate ILC2 subset is shown. ILC2s were
assessed within the CD45+ haematopoietic subset as lineage−CD127+
cells expressing KLRG1, GATA-3, Sca-1 and CD25 cell surface markers.
Numbers represent the percentages of boxed cells. The staining strategy
was validated using mLN cells from ZAP-70−/− mice as this subset
is present at relatively high levels in these immunodeficient mice.
b, The presence of ILC2 cells in mLNs of naive Pou2f3+/+ (WT) and
Pou2f3−/− (KO) mice was assessed using the gating strategy shown
above. Representative data from WT (n = 8) and KO (n = 5) mice are
presented. c, Representative plots of ILC2 cells in lamina propria of naive
WT (n = 7) and KO (n = 5) mice are shown (top). Quantifications of
ILC2 are presented as means ± s.d. d, WT and KO mice were infected
with N. brasiliensis and the presence of ILC2 in mLNs was assessed 5 days
post infection. Representative plots are shown (n = 6 mice per group).
e, Quantification of ILC2 cells in mLN of naive versus infected WT and
KO mice. The percentage of ILC2s within the live gate (left) and the absolute
numbers of ILC2s (right) are presented. Data are means ± s.d. (n = 5 for
WT, n = 8 for KO, n = 6 for both groups of infected mice). **P = 0.01.
f, The fold-increase in ILC2 (lineage−CD127+KLRG1+GATA-3+) and Th2
(CD3+CD4+GATA-3+) cells in mLN was assessed as a function of infection
(n = 6 per group). The mean fold-increase ± s.d. in WT and KO mice is
presented. *P = 0.02, ***P = 0.0005. A two-tailed Student's t-test was used.
Extended Data Figure 7 | Signalling via IL-4Rα is required and
sufficient to induce goblet and tuft cell hyperplasia. a, Quantification
of goblet (PAS and Retnlβ staining) cells in Pou2f3+/+ and Pou2f3−/− mice
infected with Nb (day 7 post infection). In Pou2f3−/− mice, crypt-villus
axes from both focally responding regions and the rest of the tissue were
counted. b, Quantification of tuft cells (Dclk1 staining) and goblet cell
hyperplasia (PAS) in Il4ra+/+ and Il4ra−/− mice. c, Histological analysis
showing tuft (Dclk1 staining) and goblet (PAS and Retnlβ staining) cells
in Pou2f3+/+ and Pou2f3−/− mice following treatment with a mixture
of rIL-4 and rIL-13 for 5 days. n = 3 mice per condition. All panels
show representative experiments replicated 3 times. Scale bars, 20 µm.
d, Quantitative analysis of the changes in the different cell types of the
intestinal epithelium of Pou2f3+/+ and Pou2f3−/− mice following treatment
with a mixture of rIL-4 and rIL-13 for 5 days. For a, b, d, n = 50
crypt-villus axes counted in 3 mice per genotype or condition. Data are
shown as means ± s.d. and P values are indicated. A two-tailed Student's
t-test with Welch's correction was used.
Extended Data Figure 8 | Signalling via IL-4Rα is sufficient to induce
goblet and tuft cell hyperplasia in mouse intestinal organoids.
a, Quantification of Dclk1 expression by qRT-PCR in Pou2f3+/+
and Pou2f3−/− organoids following rIL-4/rIL-13 treatment for 48 h to
assess the presence and amplification of tuft cells. Means ± s.d., relative
to Gapdh and Hprt, are presented. b, Expansion of the tuft cell lineage
in wild-type organoids following rIL-4/rIL-13 administration (48 h)
was monitored by Dclk1 staining. c, Expansion of the tuft cell lineage in
wild-type organoids following IL-4 or IL-13 administration (48 h) was
monitored by Dclk1, Pou2f3 and PAS stainings. Scale bars, 20 µm.
d, Retnlβ expression in Pou2f3+/+ and Pou2f3−/− organoids was monitored
as a function of rIL-4/rIL-13 treatment (48 h) by RT-PCR and data relative
to Gapdh are presented. All panels show representative experiments from
3 independent organoid cultures, replicated 3 times.




[Figure panels: a, Pou2f3 and IL25 staining, with and without IL25 primary antibody; b, Dclk1, SiglecF and Hoechst staining.]

a, Control experiment for specificity of the IL-25 immunohistochemistry,
in presence (left) or absence (right) of IL-25 primary antibody.
b, Immunohistochemistry showing specificity of Siglec-F as a marker for
intestinal epithelial tuft cells. All panels show representative experiments
from 3 independent mice, replicated 3 times.
Extended Data Table 1 | List of the oligonucleotide primer sequences


Table 1: Primer sets used for qPCR analyses (5′ to 3′):

Dclk1     CAGCCTGGACGAGCTGGTGG     TGACCAGTTGGGGTTCACAT
Gapdh     GGAGCGAGACCCCACTAACA     ACATACTCAGCACCGGCCTC
          GCAGTACAGCCCCAAAATGG     GGTCCTTTTCACCAGCAAGCT


Table 2: Primer sets used for PCR analyses (5′ to 3′):

IL13      AGCTCCCTGGTTCTCTCACT     CTCATTAGAAGGGGCCGTGG
IL25      TCTTGGCAATGATCGTGGGA     TGTGGTAAAGTGGGACGGAG
Retnlb    CCAGAAGACCATTTCCTGAGCT   CCCACGATCCACAGCCATAG
Gapdh     CAAGAAGGTGGTGAAGCAGG     TCAAGAGAGTAGGGAGGGCT
Pou2f3    AGAGAATCAACTGCCCCGTG     GGAAGGCACGACTCTCTTCC
Crystal structure of a DNA catalyst


Almudena Ponce-Salvatierra¹*, Katarzyna Wawrzyniak-Turek¹, Ulrich Steuerwald², Claudia Höbartner¹* & Vladimir Pena²


Catalysis in biology is restricted to RNA (ribozymes) and protein 
enzymes, but synthetic biomolecular catalysts can also be made of 
DNA (deoxyribozymes)¹ or synthetic genetic polymers². In vitro
selection from synthetic random DNA libraries identified DNA
catalysts for various chemical reactions beyond RNA backbone
cleavage³. DNA-catalysed reactions include RNA and DNA ligation
in various topologies⁴,⁵, hydrolytic cleavage⁶ and photorepair of
DNA⁷, as well as reactions of peptides⁸⁻¹⁰ and small molecules¹¹,¹².
In spite of comprehensive biochemical studies of DNA catalysts
for two decades, fundamental mechanistic understanding of their
function is lacking in the absence of three-dimensional models at
atomic resolution. Early attempts to solve the crystal structure of an
RNA-cleaving deoxyribozyme resulted in a catalytically irrelevant
nucleic acid fold¹³. Here we report the crystal structure of the RNA-
ligating deoxyribozyme 9DB1 (ref. 14) at 2.8 Å resolution. The
structure captures the ligation reaction in the post-catalytic state, 
revealing a compact folding unit stabilized by numerous tertiary 
interactions, and an unanticipated organization of the catalytic 
centre. Structure-guided mutagenesis provided insights into 
the basis for regioselectivity of the ligation reaction and allowed 
remarkable manipulation of substrate recognition and reaction rate. 
Moreover, the structure highlights how the specific properties of 
deoxyribose are reflected in the backbone conformation of the DNA 
catalyst, in support of its intricate three-dimensional organization. 
The structural principles underlying the catalytic ability of DNA 
elucidate differences and similarities in DNA versus RNA catalysts, 
which is relevant for comprehending the privileged position of 
folded RNA in the prebiotic world and in current organisms. 

The deoxyribozyme 9DB1 catalyses the regioselective formation 
of a native phosphodiester bond between the 3′-hydroxyl and the
5′-triphosphate termini of two RNA strands, using divalent metal ions
(Mg²⁺ or Mn²⁺) as cofactors¹⁴,¹⁵. Ribozymes that catalyse an analogous
ligation reaction are found in nature¹⁶, and have also been generated by
in vitro selection from random RNA libraries¹⁷,¹⁸ or a natural ribozyme
scaffold¹⁹. The 9DB1 DNA enzyme is composed of a central catalytic
domain flanked by two arms that hybridize to the RNA substrates by
canonical Watson-Crick base pairing (Fig. 1a, b), leaving only the two
nucleotides embracing the ligation junction unpaired. The in vitro
selection strategy used to obtain 9DB1 imposed selection pressure for
base pairing outside the ligation junction and enforced 3′,5′ selectivity
of the reaction¹⁴.

We crystallized a 44-nucleotide strand that contains the minimally
required catalytic core of 31 nucleotides²⁰ in complex with a
15-nucleotide RNA strand. The structure was solved by single anomalous
dispersion using data collected from a hexammine cobalt (III) chloride
derivative. The model built in the experimental map was refined against a 2.8 Å
resolution data set collected from a native crystal (Fig. 1c). The crystal
contains one molecule in the asymmetric unit and the final model com- 
prises all nucleotides present in the complex (Extended Data Table 1). 

The structure exhibits the shape of a letter, consisting of the
two RNA-DNA hybrid duplexes, P1 and P4, oriented at approximately
120° with respect to each other, and both tightly anchored to
the catalytic domain (Fig. 1d, e). The RNA nucleotides A−1 and G1
encompass the ligation junction and form extensive tertiary contacts
with the DNA catalyst (Fig. 1d, e). The 3′-overhanging DNA and
RNA nucleotides form a semi-continuous duplex in the crystal lattice
(Extended Data Fig. 1). Within the catalytic domain, two stacks of two
and four base pairs are formed, called P2 and P3, respectively. The
paired regions P1, P2 and P3 are positioned on top of one another in
a non-coaxial manner (Fig. 1f). In addition, the DNA residues dT29 and
dT30 base pair with the RNA nucleotides A−1 and G1 (denoted P5), which
makes the entire DNA-RNA complex fold as a double pseudoknot
(Fig. 1d).


Figure 1 | Global architecture of the DNA catalyst. a, DNA-catalysed
ligation between 3′-OH and 5′-phosphate. nt, nucleotides. b, Proposed
secondary structure of the minimal 9DB1 deoxyribozyme”’. RNA
nucleotides are in blue, ligation junction in light blue, DNA-binding arms
in grey, and core nucleotides in black and red. Nucleotides in red do not
tolerate any mutation. Blue and green lines indicate double pseudoknot
interactions. c, Solvent-flattened electron density map contoured at
1.0σ level with orange trace of backbone and nucleobases. d, Secondary
and tertiary structure with base pairs denoted in Leontis-Westhof
presentation. DNA nucleotides in black have N-type ribose pucker,
those in green have S-type conformations; grey nucleotides lie outside
typical N or S conformations. P1-P5 are colour coded with respect to the
structure shown in e. Inset: schematic overview of the double pseudoknot.
e, Cartoon representation of the deoxyribozyme in complex with the
ligated RNA product. f, A 60° rotation of the image shown in e.


¹Max Planck Research Group Nucleic Acid Chemistry, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany. ²Research Group Macromolecular


Figure 2 | Tertiary contacts within the catalytic domain. a, Multiple
stacking and base pairing contacts in the 9DB1 core. b, c, Close-up view of
multiplets. Black dashes indicate hydrogen bonds.

The overall fold of the catalytic domain is stabilized by tertiary 
interactions that connect the regions labelled J1/2, J3/2, P2 and P3 
in four successive layers of multiplets (Fig. 2a). Hydrogen bonding 
forms the long-range Watson-Crick base pair dC12:dG26 and the 
non-canonical Watson-Crick base pair dT14:dG25 in P2 (Fig. 2b, c), 
as well as alternative base pairs involving the sugar edges of dG8 and 
dG21, the Watson-Crick edges of dG8, dG21 and dA23, and the 
Hoogsteen edges of dA10, dG22 and dG26. Moreover, the compact 
DNA architecture is stabilized by stacking interactions of nucleotides 
in J1/2 and J3/2, which place dT11 and dG24 sandwiched between 
the non-canonical base pairs dG8:dA10 and dG21:dA23. These inter- 
actions are consistent with previous atomic mutagenesis data, which 
showed that the Watson-Crick edges of dG8, dT14, dG21 and dG25 
are essential for the functional integrity of 9DB1 (ref. 21). Moreover, 
dG26 did not tolerate changes on the Watson-Crick or Hoogsteen
edges, consistent with its interactions in a base triple with the dC12 
and dG8 nucleotides (Fig. 2b). 

The reactive nucleotides A−1 and G1 are positioned in the active
site by a scaffold formed by four nucleotides from distinct locations
of the primary sequence, corroborated by the large angular opening
between the duplexes P1 and P4 (Fig. 3a, b). Thus, the junction J2/3
protrudes from the catalytic domain and extensively contacts A−1 and
G1 by base pairing with dT30 and dT29, respectively, as well as by the
stacking of dG27 on G1 (Fig. 3a, b). On the opposite side, a 130° kink in
the backbone between dT14 and dC16 causes dA15 to form a stacking
interaction with A−1 (Fig. 3b). In this way, A−1 and G1 are stacked on
one another, sandwiched by dA15 and dG27 and engaged in hydrogen 
bonding with dT29 and dT30, resulting in a microenvironment rem- 
iniscent of a duplex. 
Figure 3 | Positioning of reactive nucleotides in the catalytic domain.
a, Structure at the ligation junction. b, Base-pairing and stacking
interactions of the ligated nucleotides G1 and A−1. Black dashes
indicate hydrogen bonds. c, Biochemical interrogation of the base pair
of nucleotides 1:29. Ligation rate or yield with original 9DB1 (dT29,
black), and with mutant 9DB1 DNAs (green) complementary to the RNA
substrates, that is, dT29dC for pppG1, dT29dG for pppC1, dT29dA for
pppU1 RNA. d, Activity assays of 9DB1 with substrate analogues modified
at the 2′-position of A−1. c, d, Single-turnover kinetic assays were
performed at pH 7.5 with 20 mM Mn²⁺ at 37 °C. All experiments were
repeated two or three times. k_obs, observed rate constant; k_rel, relative rate
constant with respect to A−1 substrate.


The identification of the previously unknown base pairs (P5) at 
the ligation junction was intriguing, as dT29 and dT30 were shown 
to tolerate mutations²⁰, and RNA substrates with uridine or guanosine
at position −1 were also efficiently ligated¹⁴,¹⁵. In contrast, only
5′-purine nucleotides were allowed at position 1 in the donor RNA
(that is, 5′-pppG1 or 5′-pppA1)¹⁴. Therefore, the crystal structure of
9DB1 revealed an effective way of manipulating the catalytic activity
by compensatory mutagenesis of the base pair at the ligation junction.
Indeed, RNA substrates that were inert to the original deoxyribozyme,
such as 5′-pppC1 and 5′-pppU1 RNA, are now readily ligated with
mutated 9DB1 enzymes bearing G and A, respectively, in position 29 
(Fig. 3c). On the basis of the crystal structure, 9DB1 deoxyribozymes 
can now be designed for ligation of any RNA substrates, irrespective 
of their sequence. 
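
The compensatory design rule described above can be sketched in a few lines of Python. This is only an illustration: the function name and the lookup table are hypothetical, and the sketch assumes that a simple Watson-Crick partner at position 29 is the whole rule, which ignores that the original dT29 enzyme also ligates 5′-pppG1 donors. The variant labels follow the notation of Fig. 3c.

```python
# Watson-Crick partner (written as a DNA nucleotide) for each RNA base
# at donor position 1. Illustrative table, not taken from the paper's methods.
DNA_PARTNER = {"G": "dC", "C": "dG", "U": "dA", "A": "dT"}


def position29_variant(donor_base: str) -> str:
    """Name the 9DB1 variant whose position 29 pairs with the donor base.

    Hypothetical helper: returns "dT29" when the original enzyme already
    carries the complementary nucleotide, and a label such as "dT29dC"
    for a compensatory mutant.
    """
    base = donor_base.upper()
    if base not in DNA_PARTNER:
        raise ValueError(f"unknown RNA base: {donor_base!r}")
    partner = DNA_PARTNER[base]
    return "dT29" if partner == "dT" else "dT29" + partner


for base in "GCUA":
    print(f"5'-ppp{base}1 donor -> {position29_variant(base)}")
```

The three mutant labels produced for G, C and U donors match those listed in the Fig. 3 legend; the A donor maps back to the unmodified enzyme.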

The regioselectivity of the ligation reaction is determined by the relative
orientation of the reacting nucleotides. For ribozymes²²,²³ as well
as deoxyribozymes²⁴, continuous Watson-Crick base pairing across
the ligation junction favours nucleophilic attack by the 3′-OH group
onto the 5′-triphosphate of G1. The regioselectivity of 9DB1 is dictated
by tertiary contacts and specific interactions at the ligation junction,
which differ markedly from a continuous double-helical arrangement.
Thus, the orientation of the donor nucleotide G1 is assisted by a
hydrogen bond of its 2′-OH to the minor groove of dC12:dG26 in P2
(Fig. 4a). The arrangement of the binding arms induces a shorter distance
between the phosphate groups of A−1 and G1 (5.3 Å as compared
to 5.9 Å in a regular A-form duplex). The hydrogen bond between the
2′-OH group of A−1 and the O4′ (or O5′) of G1 may sequester the
2′-OH and thereby assist the regioselective formation of 3′,5′-ligated
products (Fig. 3b). Consistently, removal or blocking of the 2′-OH
group by introducing a 2′-deoxy or 2′-OCH₃ nucleotide at position
A−1 prevented efficient ligation (Fig. 3d). In contrast, replacement of the
2′-OH with 2′-NH₂ still allowed RNA ligation, although with a 30-fold
slower rate (Extended Data Fig. 2). Interestingly, 2′-fluoro-modified
RNA was ligated only 6-7-fold slower (Fig. 3d and Extended Data
Fig. 2), suggesting that the electronegativity of fluorine and the resulting
3′-endo conformation of the ribose compensate for the loss of the
2′-OH group and orient the 3′-OH for nucleophilic attack onto the
5′-phosphate.

Catalysis of phosphoryl-transfer reactions by proteins or ribozymes 
often involves activation of nucleophilic groups and electrostatic stabili- 
zation of the transition state by divalent metal ions or specific chemical 
groups”. The active site of an RNA enzyme that catalyses the same 
reaction as the DNA enzyme 9DB1, is composed of two phosphate 
groups coordinating a catalytic metal ion, plus an amino group from 
a nucleobase and a 2′-OH group that exerts its action through the
intermediacy of a water molecule”’. There are no equivalent interactions in
the core of 9DB1, and we do not observe electron density for a catalytic
metal ion. However, a striking feature is the phosphate group of
nucleotide dA13, which lies within 3.1 Å distance from the ligation
junction (Fig. 4a). To test the role of the dA13 phosphate in catalysis, we 
replaced either of the non-bridging oxygen atoms with a sulfur atom, 
and assessed the enzymatic activity of the resulting phosphorothio- 
ate-modified DNAs. In the presence of manganese ions, the ligation 
rate of the deoxyribozyme containing an Sp phosphorothioate was 
reduced 100-fold, whereas the Rp stereoisomer retained its full activity 
(Fig. 4b). A similar phosphorothioate interference effect was observed 
when magnesium was used as metal ion cofactor, and the catalytic 
activity could also not be rescued by the addition of cadmium ions. 
The absence of thiophilic metal rescue neither proves nor excludes the
involvement of metal ions in catalysis. Upon substitution of the non-bridging
phosphate oxygens with a methyl group, only one of the two resulting 
uncharged methylphosphonate diastereomers retained catalytic activ- 
ity (Fig. 4b). Although the role of metal ions remains unclear, these 
data are consistent with the critical role of the non-bridging pro-Sp 
phosphate oxygen of dA13 in activating the 3’-OH group to promote 
RNA ligation. 

This work provides the experimental evidence that DNA possesses 
the ability to fold into compact three-dimensional units, stabilized 
by intrinsic tertiary interactions in the absence of 2'-OH groups or 
artificial hydrophobic substituents”*. In contrast with ribozymes, the
sugar-phosphate backbone of the catalytic domain of 9DB1 exhibits 
much larger conformational diversity, as reflected by the broad distri- 
bution of pseudorotation phase angles found in the DNA nucleotides 
(Extended Data Fig. 3). Thus, of the 31 nucleotides in the catalytic 
domain, 8 and 20 adopt north (N)- and south (S)-type ribose puckers, 
respectively (Fig. 1d). The remaining three nucleotides lie outside typ- 
ical N or S conformations. The less restrictive puckering allows DNA 
to explore a wide range of conformations, compensating for the lack 
of the 2’-hydroxyl group, which has an important structural role in 
ribozymes. 
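
As an illustration of how sugar puckers are sorted into N- and S-type from a pseudorotation phase angle, the sketch below applies the standard Altona-Sundaralingam relation to the five endocyclic torsion angles ν0 to ν4. It is not the PROSIT implementation used in the paper. The N and S windows (0-36° and 144-190°) are assumed, commonly quoted ranges rather than the cut-offs used by the authors, and `ideal_torsions` is a hypothetical helper that only generates test input.

```python
import math


def pseudorotation_phase(nu0, nu1, nu2, nu3, nu4):
    """Pseudorotation phase angle P in degrees, in the range [0, 360).

    tan P = ((nu4 + nu1) - (nu3 + nu0)) / (2 * nu2 * (sin 36 + sin 72)).
    atan2 places P in the correct quadrant, including the nu2 < 0 case.
    """
    numerator = (nu4 + nu1) - (nu3 + nu0)
    denominator = 2.0 * nu2 * (math.sin(math.radians(36.0)) + math.sin(math.radians(72.0)))
    return math.degrees(math.atan2(numerator, denominator)) % 360.0


def classify_pucker(p_deg, n_range=(0.0, 36.0), s_range=(144.0, 190.0)):
    """Label a phase angle as 'N', 'S' or 'other' (assumed windows)."""
    if n_range[0] <= p_deg <= n_range[1]:
        return "N"
    if s_range[0] <= p_deg <= s_range[1]:
        return "S"
    return "other"


def ideal_torsions(p_deg, nu_max=38.0):
    """Hypothetical helper: torsions nu0..nu4 of an ideal ring with phase P."""
    return [nu_max * math.cos(math.radians(p_deg + 144.0 * (j - 2))) for j in range(5)]


for phase in (18.0, 162.0, 90.0):
    recovered = pseudorotation_phase(*ideal_torsions(phase))
    print(f"P = {recovered:6.1f} deg -> {classify_pucker(recovered)}")
```

With windows of this kind, a nucleotide whose phase angle falls between the two ranges is reported as neither N nor S, which corresponds to the grey nucleotides in Fig. 1d.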

Comparison of 9DB1 with two RNA ligase ribozymes illustrates how 
the fold of DNA and RNA enzymes underlies the recognition and posi- 
tioning of the substrates in the catalytic centre (Extended Data Fig. 4). 
In the class I ligase and L1 ligase ribozymes, the reactive nucleotides are 
positioned in extended helices (Extended Data Fig. 5)??°. In contrast, 
the four contributing bases in the 9DB1 ligase are provided in trans 
from different primary locations of the catalytic domain. Notably, all 
four DNA nucleotides adopt S-type ribose puckers, highlighting how 
the specific conformational propensity of deoxyribose moieties sup- 
ports recognition and positioning of the substrate in the active centre. 

The crystal structure of the 9DB1 deoxyribozyme demonstrates that 
DNA possesses the intrinsic ability to adopt complex tertiary folds that 
support catalysis, and unveils for the first time the active site of a DNA
enzyme in the post-catalytic state. Together with mutagenic analyses, 
the structure elucidates the basis of substrate recognition and provides 
first insights for rationalization of the regiospecific bond formation. 
The ability of DNA to form complex three-dimensional architectures 
that support catalysis raises questions about the positioning of this 
biopolymer in the prebiotic evolution of life and may shed light on 
metabolic events in current organisms, in which single-stranded DNA 
may adopt functionally important folds. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Oligonucleotide synthesis and complex formation. RNA and DNA oligonucleotides
were prepared by solid-phase synthesis using phosphoramidite chemistry
on polystyrene or controlled-pore glass solid supports. For RNA synthesis,
2′-O-TOM-protected RNA phosphoramidites were used³¹. 3′-Modified RNA was
prepared on commercially available modified (2′-OCH₃, 2′-NH₂, 2′-F-uridine)
solid supports or universal support (2′-F-adenosine). All oligonucleotides were
deprotected with ammonia or methylamine, followed by tetrabutylammonium
fluoride (for RNA only), desalted and purified by denaturing polyacrylamide
gel electrophoresis. The phosphorothioate modification was introduced during
solid-phase synthesis using phenylacetyl disulfide³². Methylphosphonate
modifications were introduced using commercially available N⁴-acetyl-dC
methylphosphonamidite. Phosphorothioate and methylphosphonate diastereomers were
separated by RP-C18 high-performance liquid chromatography (HPLC; NH₄OAc/
CH₃CN gradient) and assigned according to elution order as reported³³,³⁴.
Full-length deoxyribozymes were generated by splinted ligation of two fragments.
5′-Triphosphorylated donor RNAs with 5′-purine nucleotides were prepared by
in vitro transcription with T7 RNA polymerase. Synthetic triphosphorylated RNAs
were prepared by solid-phase synthesis following published procedures³⁵,³⁶ or
provided by J. Ludwig. Crystallization complexes were formed by mixing RNA and
DNA strands in a 1:1 molar ratio, heating to 95 °C, and slowly cooling to room
temperature before MgCl₂ was added. The final concentration of the complex was
0.8 mM in 10 mM HEPES pH 8.0, 150 mM NaCl, 2 mM KCl and 20 mM MgCl₂.
Crystallization and diffraction experiments. Crystals were grown at 4 °C from
solutions containing 35% 2-methyl-2,4-pentanediol (MPD), 0.5 mM spermine,
20 mM MgCl₂, 100 mM NaCl and 100 mM sodium cacodylate (pH 6.5) using the
sitting-drop vapour-diffusion method. Sitting drops with volumes of 2 µl produced
crystals after 5-10 days, reaching full size after 30 days. The derivatized crystals
were prepared by soaking the crystals in a solution containing the mother liquor
and hexammine cobalt (III) chloride for 1 min.

Prior to data collection, crystals were cryoprotected with perfluoropolyether 
(PFPE) and flash frozen. The crystallographic experiments were performed on 
the PXII beamline at the Swiss Light Source, Paul Scherrer Institut, Villigen, 
Switzerland. The data sets were processed with XDS³⁷.

Crystallographic analyses. The structure of the complex was determined by single
anomalous dispersion (SAD) using a hexammine cobalt (III) chloride derivative.
The determination of the heavy atom substructure was done with SHELXD³⁸.
Initial SAD phase calculation and automatic model building were carried out
with the AutoSol wizard³⁹. The resulting model from AutoSol was manually edited
with COOT⁴⁰. The edited partial model was used along with the experimentally
determined cobalt sites for MR-SAD phase calculation within Phaser⁴¹. The
resulting map was solvent flattened using RESOLVE⁴². This map was used for
tracing the entire phosphodiester backbone. The RNA strand of the model was
built using Rcrane⁴³ and the DNA strand was built from individual nucleotide
monophosphates. Structure refinement was carried out with phenix.refine⁴⁴.
Analysis of pseudorotation phase angles was performed on the PROSIT webserver
(http://cactus.nci.nih.gov/prosit/)⁴⁵.

Kinetic assays of DNA-catalysed RNA ligation reactions. Single-turnover assays
were performed as described previously¹⁴,¹⁵. Briefly, 5′-³²P-trace-labelled acceptor
RNA (3 pmol), 9DB1 deoxyribozyme, and 5′-triphosphorylated donor RNA were
mixed in a 1:5:10 ratio and annealed by heating to 95 °C for 2 min, followed by slow
cooling for 15 min to room temperature (for RNA/DNA sequences see Extended
Data Table 2). The assays were performed in 10 µl volume at 37 °C in buffer
containing 50 mM HEPES, pH 7.5, 150 mM NaCl, 2 mM KCl and 20 mM MnCl₂, or
50 mM Tris buffer, pH 8.5, 150 mM NaCl, 2 mM KCl and 80 mM MgCl₂, including
0, 0.1, 1 or 10 mM CdCl₂. Aliquots of 1 µl were removed at desired time points
and quenched in stop solution (80% formamide, 1× TBE, 50 mM EDTA, 0.025%
bromophenol blue and 0.025% xylene cyanol). The samples were analysed by PAGE
(15% polyacrylamide), and band intensities were quantified by phosphorimaging.
The yield versus time data were fit to (fraction ligated) = Y(1 − e^(−kt)), where k = k_obs
and Y = final yield. No statistical methods were used to predetermine sample size.
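
The single-exponential model above can be illustrated with a short Python sketch. The fitting procedure is not described in the text, so the estimator here is an assumption chosen for simplicity: with the final yield Y taken as known, ln(1 − f/Y) = −kt is linear in t, and k_obs follows from a least-squares line through the origin. The function names and the example numbers are hypothetical; a nonlinear fit of both Y and k would be the more usual choice for real data.

```python
import math


def fraction_ligated(t, final_yield, k_obs):
    """Model from the text: fraction ligated = Y * (1 - exp(-k * t))."""
    return final_yield * (1.0 - math.exp(-k_obs * t))


def estimate_k_obs(times, fractions, final_yield):
    """Estimate k_obs by least squares on ln(1 - f/Y) = -k * t.

    Assumes the final yield Y is known; points with t <= 0 or f >= Y
    carry no usable information in this linearization and are skipped.
    """
    sum_ty = 0.0
    sum_tt = 0.0
    for t, f in zip(times, fractions):
        if t <= 0 or f >= final_yield:
            continue
        sum_ty += t * math.log(1.0 - f / final_yield)
        sum_tt += t * t
    if sum_tt == 0.0:
        raise ValueError("no usable time points")
    return -sum_ty / sum_tt


def relative_rate(k_obs, k_reference):
    """Rate constant relative to a reference substrate (k_rel)."""
    return k_obs / k_reference


# Synthetic, noise-free time course (illustrative values only).
times_min = [0, 5, 10, 30, 60]
yields = [fraction_ligated(t, 0.8, 0.05) for t in times_min]
print(f"k_obs ~ {estimate_k_obs(times_min, yields, 0.8):.4f} per min")
```

A relative rate constant such as the k_rel values quoted in the Fig. 3 legend is then the ratio of two such estimates, for example a modified substrate against the unmodified A−1 substrate.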
Extended Data Figure 1 | Semi-continuous helix in the crystal lattice. Nucleotides important for the crystal contacts are labelled accordingly.
Extended Data Figure 2 | Influence of 2′-modifications on mutant acceptor RNA U−1.
Extended Data Figure 4 | Active sites of RNA-ligating nucleic acid enzymes. Left: deoxyribozyme 9DB1; middle: class I ribozyme; right: L1 ligase
ribozyme. Figure was generated using Protein Data Bank (PDB) accession numbers 5CKK (9DB1), 3R1L (class I) and 2OIU (L1).
Extended Data Figure 5 | Positioning of the reactive nucleotides
in the active centre of ligase deoxyribozyme 9DB1 in comparison
to ribozymes. A schematic of the three enzymes is shown before and
after the crystal structure determination (left and central columns,
respectively). Strands that were deliberately chosen are depicted in red
and blue, while the ones that resulted from random selection are shown in
black. Nucleotides at the ligation junction are coloured in cyan, stacking
nucleotides are depicted in light red. Nucleotides shown as white-filled
circles are indicated for orientation purposes. Recognition between
reactive and pairing nucleotides is shown as dashed lines (left column).
Structures of product-bound nucleic acid enzymes are compared (PDB
accessions 5CKK (9DB1), 3HHN (class I) and 2OIU (L1), right column).
Extended Data Table 1 | Data collection and refinement statistics


                          Native                    Co-Hex

Data collection
Space group               P 43 21 2                 P 43 21 2
Cell dimensions
  a, b, c (Å)             83.1, 83.1, 55.9          80.8, 80.8, 55.8
  α, β, γ (°)             90.0, 90.0, 90.0          90.0, 90.0, 90.0
Wavelength                1.0                       1.6
Resolution (Å)            41.58-2.80 (2.90-2.80)    40.42-2.98 (3.09-2.98)
Rmerge (%)                5 (>100)                  8 (>100)
I/σI                      33.87 (0.43)              25.23 (0.86)
CC1/2                     0.99 (0.90)               0.99 (0.52)
Completeness (%)          97 (97)                   100 (98)
Redundancy                24.8 (25.8)               26.1 (24.0)

Refinement
Resolution (Å)            2.80                      2.98
No. reflections           5012                      4053
Rwork/Rfree               0.27/0.29                 0.25/0.31
No. atoms
  Nucleic acid            1232                      1232
  Ligand/ion
B-factors
  Nucleic acid
  Ligand/ion
R.m.s. deviations
  Bond lengths (Å)                                  0.004
  Bond angles (°)                                   0.60

One crystal was used for each data set. Highest-resolution shell values are shown in parentheses.
Extended Data Table 2 | Oligonucleotides for kinetic experiments


description 


acceptor RNA 3’-A 

acceptor RNA 3’-(2’dA) 
acceptor RNA 3’-(2-OMeA) 
acceptor RNA 3’-(2’-F-A) 
acceptor RNA 3’-U 

acceptor RNA 3’-(2’-OMe-U) 
acceptor RNA 3’-( 2’-NH2-U) 
acceptor RNA 3’-(2’-F-U) 
donor RNA1 5’pppG 
donor RNA2 5’pppG 

donor RNA 5’pppA 

donor RNA 5’pppC 

donor RNA 5’pppU 
9DB1 DNA original* 

9DB1 DNA*,t 

9DB1 dT29C 

9DB1 dT29A 

9DB1 dT29G 

9DB1 DNA (Fig. 4b) 

9DB1 DNA (ps) (Fig. 4b) 
9DB1 DNA (pCH3) (Fig. 4b) 


splint for ligation 


*Catalytic core is underlined.
†dT29 is in bold.
‡Ligation site is indicated with |.


5’-sequence-3’ 


GGAAGU 
GGAAGU 
GGAAGU 
GGAAGU 
GGAAGU 
GGAAGU 
GGAAGU 
GGAAGU 
pppGAU 
pppGACGC 
PppAACGC 
pppCACGC 


PPPUACGC 


CUCAUGUACUA 


CUCAUGUACU (dA) 


CUCAUGUACU (OMeA) 


CUCAUGUACU (FA) 


CUCAUGUACUU 


CUCAUGUACU (OMeU) 


CUCAUGUACU (NH2U) 


CUCAUGUACU (FU) 


GUUCUAGCGCCGGA 


UGACCCUGAAGUUCAUCUU 


UGACCCUGAAGUUCAUCUU 


UGACCCUGAAGUUCAUCUU 


UGACCCUGAAGUUCAUCUU 


CAAGGCGC TAGAACATGGATCATACGGTCGGAGGGGTTTGCCGTT' 


GAACT 


GAACT 


GAACT 


GAACT 


GCGCT 


GCGCT 


GCGCT 


CCCCT 


TAAGTACATGAGACTTCC 


['TCAGGGTCAGCGTGGATCAT 


['TCAGGGTCAGCGTGGATCAT 


['TCAGGGTCAGCGTGGATCAT 


['TCAGGGTCAGCGTGGATCAT 


[TAGAACATGGATC (pCH3) A‘ 


[CCGACCGTATGATCCATGT' 


TACGGTCGGAGGGGTT’ 


TACGGTCGGAGGGGTC! 


TACGGTCGGAGGGGTAT 


TACGGTCGGAGGGGTG! 


TAGAACATGGATCATACGGTCGGAGGGGTTTGCCGT' 


[Cc 


[GCCG 


['GCCG1 


['GCCG1 


['GCCGT 


['TAAG 
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['TTAAGTACATGAGACTTCC 


['TTAAGTACATGAGACTTCC 


['TTAAGTACATGAGACTTCC 


['TTAAGTACATGAGACTTCC 


TACATGAGAC 


TAGAACATGGATCpsATA | CGGTCGGAGGGGTTTGCCGTTTAAGTACATGAGAC 


[A | CGGTCGGAGGGGTTTGCCGT TTAAGTACATGAGAC 
Structures of two distinct conformations of
holo-non-ribosomal peptide synthetases 


Eric J. Drake!?*, Bradley R. Miller!*, Ce Shi’, Jeffrey T. Tarrasch‘, Jesse A. Sundlov!?, C. Leigh Allen!, Georgios Skiniotis*, 


Courtney C. Aldrich? & Andrew M. Gulick! 


Many important natural products are produced by multidomain 
non-ribosomal peptide synthetases (NRPSs)¹⁻⁴. During synthesis,
intermediates are covalently bound to integrated carrier domains
and transported to neighbouring catalytic domains in an assembly
line fashion⁵. Understanding the structural basis for catalysis with
non-ribosomal peptide synthetases will facilitate bioengineering 
to create novel products. Here we describe the structures of two 
different holo-non-ribosomal peptide synthetase modules, each 
revealing a distinct step in the catalytic cycle. One structure 
depicts the carrier domain cofactor bound to the peptide bond- 
forming condensation domain, whereas a second structure 
captures the installation of the amino acid onto the cofactor within 
the adenylation domain. These structures demonstrate that a 
conformational change within the adenylation domain guides 
transfer of intermediates between domains. Furthermore, one 
structure shows that the condensation and adenylation domains 
simultaneously adopt their catalytic conformations, increasing 
the overall efficiency in a revised structural cycle. These structures 
and the single-particle electron microscopy analysis demonstrate 
a highly dynamic domain architecture and provide the foundation 
for understanding the structural mechanisms that could enable 
engineering of novel non-ribosomal peptide synthetases. 

A non-ribosomal peptide synthetase (NRPS) module incorporates 
a single residue into a peptide natural product. Each module contains 
a peptidyl carrier protein (PCP) that is post-translationally modified 
with a phosphopantetheine cofactor, an adenylation domain that
loads the amino-acid substrate onto the PCP cofactor, and a condensa- 
tion domain that catalyses peptide bond formation. NRPSs then use a 
carboxy (C)-terminal thioesterase or reductase domain to catalyse product
release. Structures of individual domains provide insight into the
NRPS structural mechanism. Interestingly, the adenylation domains
have been shown to adopt two catalytic conformations. First, the
adenylate-forming conformation activates the amino-acid substrate
using ATP to form an aminoacyl adenylate and pyrophosphate.
A C-terminal subdomain then rotates by ~140° to form the
thioester-forming conformation that is used to install the amino acid
onto the PCP. These two functional states have been observed in structures
of the phenylalanine-activating adenylation domain of gramicidin
synthetase and the complexes between adenylation and carrier
proteins obtained with mechanism-based inhibitors. Once loaded,
both the pantetheine and loaded substrate have been shown to interact
transiently with the core of the carrier protein. The structure
of SrfA-C, the terminal module from surfactin biosynthesis, contains
a condensation-adenylation-PCP-thioesterase architecture and is to
date the only structure of an intact NRPS module. The condensation
and adenylation domains share an extensive interface and were proposed
to form the core of the module. Lacking the pantetheine modification,
this apo-structure shows the PCP domain directed towards
the condensation domain. The other active sites are 40-60 Å from the
pantetheinylation site, indicating that extensive domain rearrangements
are required to complete the NRPS catalytic cycle. Movement of the PCP
domain, potentially coupled to the adenylation C-terminal subdomain
rotation, is necessary for delivery of the peptide intermediates to the
different catalytic domains.

We determined structures of two NRPSs with the same architec- 
ture as SrfA-C (Extended Data Fig. 1), but with holo-proteins that 
show functional interactions between the PCP and catalytic domains 
(Fig. 1). First we present two structures of AB3403 from the human 
pathogen Acinetobacter baumannii (protein annotation ABBFA_003403 
in strain AB307-0294) that belongs to an uncharacterized biosynthetic 
pathway implicated in motility, and biofilm and pellicle formation.
We describe the structures of holo-AB3403 obtained without ligands 
and also upon crystallization in the presence of Mg-ATP and glycine, 
which among the proteinogenic amino acids serves as the best sub- 
strate (Extended Data Fig. 2). Second, we present the structure of EntF 
from Escherichia coli, showing the PCP cofactor covalently trapped 
with a mechanism-based inhibitor to model thioester formation within 
the adenylation domain. These results provide views of two distinct 
steps in the NRPS catalytic cycle and demonstrate how the domain 
rotation within the adenylation domain mediates the delivery of the 
PCP between the two catalytic domains. 

The structures of AB3403 were determined at 2.7 and 2.9 Å resolution
(Extended Data Table 1). No prior structure exists of an NRPS
condensation domain bound to a ligand; the holo-AB3403 protein 
shows the pantetheine cofactor residing in the active site (Fig. 2 and 
Extended Data Fig. 3a). The two lobes of the condensation domain 
adopt the closed orientation seen recently in the CDA synthetase 
condensation domain. Contacts are made between the pantetheine
and the helix running from Glu20 to Leu39, in particular Tyr26 and 
Ile27, which forms one wall of the tunnel through which the panteth- 
eine approaches the active site (Fig. 2b). Additionally, Tyr37 forms 
a hydrogen bond with the amide of the cysteamine moiety of the 
pantetheine cofactor. As the main chain carbonyl of Tyr37 hydrogen 
bonds to the main chain amide of the catalytic His145, this is a critical 
interaction to close the two lobes and bring the active histidine into 
proper position. 

Holo-AB3403 therefore illustrates the conformation that is adopted to 
properly deliver the pantetheine of the PCP to the condensation domain. 
The PCP is rotated ~30° relative to the orientation of the PCP domain of 
SrfA-C (Extended Data Fig. 4). The AB3403 PCP interface with the condensation
domain is composed of residues from helix α2, the helix that
follows the pantetheinylation site at Ser1006, and the loops that precede 
and follow this helix. In particular, residues Phe999 to Tyr1032 face 
the condensation domain. Leu1007 and Val1010 form a hydrophobic 
interaction with Leu22 and Ile80 of the condensation domain. The side 
chain of Lys1011 forms a hydrogen bond with the main chain carbonyl 
of Gln78. Finally, Val1026, Ala1027, and Ala1030 on the PCP helix α3
form a hydrophobic interaction with Tyr26 and Leu30. Arg344 of the
condensation domain, which is positioned on an insertion compared
with SrfA-C, interacts with the phosphate from the cofactor.

Figure 1 | Ribbon diagrams of complete NRPS modules. a, Domain
architecture of three structurally characterized termination modules.
b-d, The protein structures of (b) AB3403, (c) EntF, and (d) SrfA-C are
shown with domains coloured white (condensation), pink and red
(adenylation domain N- and C-terminal subdomains), green-cyan (PCP),
and blue (thioesterase). The phosphopantetheine moieties of AB3403 and
EntF, and inhibitor Ser-AVS, are highlighted.

The AB3403 adenylation domain (Fig. 2c) is precisely positioned in 
the adenylate-forming conformation, unlike the adenylation domain of 
SrfA-C, which is in an open conformation that may be used for substrate 
binding or release. The lysine of the conserved catalytic A10 motif
interacts with a phosphate oxygen from AMP and a carboxylate oxygen
from glycine and superimposes with the homologous lysine in the gramicidin
synthetase domain. In SrfA-C, the homologous lysine is ~12 Å away.

The thioesterase domain of AB3403 is structurally similar to the 
homologous domains of both SrfA-C and EntF (Extended Data Fig. 5), 
the latter of which has been characterized by NMR and crystallography 
in complex with the upstream PCP domain. Despite the similarities
in domain structure, the thioesterase domain of AB3403 is in a mark- 
edly different location compared with SrfA-C (Fig. 3a). Interestingly, 
in this new position the thioesterase domain cradles the back face of the 
PCP domain. The thioesterase domains of SrfA-C or AB3403 do not 
make substantial contacts with the other catalytic domains.

Figure 2 | NRPS domain structures. a, The condensation domain of
AB3403 (white) was aligned with SrfA-C (yellow) and EntF (orange) on
the basis of the condensation C-terminal subdomain. The AB3403 PCP
is included. b, The AB3403 condensation domain highlights residues that
form the hydrophobic tunnel through which the pantetheine passes.
c, Superposition of adenylation domains of AB3403 (pink and maroon
for N- and C-terminal subdomains), SrfA-C (yellow) and gramicidin
synthetase, GrsA (cyan), with phenylalanine and AMP molecules of
GrsA. The dotted line highlights the alternative position of the catalytic
lysines of AB3403 and SrfA-C. d, The EntF adenylation domain active site
shows a covalent linkage from the pantetheine to the Ser-AVS inhibitor.
e-g, Electron density maps calculated with coefficients of the form Fo − Fc,
generated before inclusion of ligands and contoured at 3σ, are shown
for the (e) AB3403 condensation, (f) AB3403 adenylation, and (g) EntF
adenylation domains.


We next examined the delivery of the holo-PCP to the adenylation 
domain in a different NRPS protein. We have previously used targeted 
mechanism-based inhibitors, harbouring a vinylsulfonamide moiety 
that traps the thioester-forming reaction, to characterize functional
adenylation-PCP di-domain interactions. These inhibitors mimic
the native aminoacyl adenylate, but contain a Michael acceptor posi- 
tioned to react with the pantetheine thiol. EntF crystallized only in the 
presence of the serine adenosine vinylsulfonamide (Ser-AVS) inhibitor 
(Fig. 2d and Extended Data Fig. 6) that limits conformational flexibility
to promote crystallization. Crystals of the EntF protein diffract to 2.8 Å
(Extended Data Table 2). No electron density was observed for the
thioesterase domain although the intact protein was present in the
crystal lattice (Extended Data Fig. 7).

Figure 3 | Conformational dynamics in NRPS modules. a, Alternative
locations of the thioesterase domain in SrfA-C and AB3403. b, Representative
electron microscopy class averages of EntF. The smaller thioesterase (TE)
domain is observed in various positions relative to the condensation
(C)-adenylation (A) di-domain. Overall EntF adopts a variety of extended
(top) to compact (bottom) conformations. c, The interface between the
condensation C-terminal subdomain and the adenylation domain is shown
for SrfA-C, AB3403, and EntF. The adenylation surface is shown in white,
highlighting in red the regions that interact with the condensation domain.
The right panel shows this interface, rotated by 90° around the y axis, with
the condensation domain omitted for clarity.

The condensation domain of EntF is similar to the closed AB3403 
conformation (Fig. 2a). The adenylation domain adopts the catalytic 
thioester-forming conformation of prior adenylation-PCP proteins,
demonstrating that the conformation is compatible with a full NRPS
module. The active site of the EntF adenylation domain identifies
conserved residues (Fig. 2d) that have been shown to play important
catalytic roles in other members of this enzyme superfamily. Arg863
interacts with the cofactor phosphate, while Gly864 and Gln865 form 
one wall of the pantetheine tunnel. Interactions with the nucleotide 
occur between Asp840 and the ribose hydroxyls, and between Tyr746 
and Tyr852 and the adenine ring. The inhibitor serine binds in the 
binding pocket formed by Asp648, Ser722, and Asp754 (Fig. 2d). 

The lack of density for the thioesterase domain in EntF suggested 
multiple conformations in the crystal lattice. This is not surprising 
given the limited interactions in SrfA-C and AB3403 between the 
thioesterase domains and the other domains. To assess thioesterase 
conformational mobility, we examined EntF by negative-stain electron 
microscopy followed by classification and averaging of single-particle 
projections (Extended Data Fig. 8). The class averages revealed pri- 
marily a tri-lobed density with two neighbouring globular densities of 
similar size attributed to the condensation and adenylation domains 
and a smaller lobe attributed to the thioesterase domain (Fig. 3b). The
positioning of the thioesterase domain assumes a surprisingly wide 
range of distances and angles relative to the other domains. 

The large interface of the SrfA-C condensation and adenylation
domains suggested they constitute a catalytic platform, upon which
the other domains move. We therefore compared the interfaces of the
three NRPS modules (Fig. 3c). The interface in AB3403 is 1,023 Å²,
comparable in size to the 1,097 Å² interface of SrfA-C. In contrast, the
interface in EntF is only 780 Å², resulting from the rotation of the adenylation
C-terminal subdomain to the thioester-forming conformation.

Additionally, the conformation of the interface is not conserved 
between all three proteins. Alignment of the structures on the basis of 
the amino (N)-terminal subdomains of the adenylation domain shows
that the condensation domains of AB3403 and EntF differ slightly
from each other and more significantly from SrfA-C. In AB3403 and
EntF, the condensation domains are rotated by ~25° relative to the
adenylation domains. Furthermore, the EntF condensation domain is
shifted closer towards the adenylation domain. Structural comparisons
suggest that this alternative conformation in EntF may not be compatible
with the adenylate-forming conformation. The three different
condensation-adenylation domain conformations, the adenylate-forming
incompatibility seen in EntF, and the multiple extended and compact
conformations seen in the electron microscopy data suggest that the
condensation-adenylation domain platform may be more dynamic
than previously proposed.

The new structures confirm the hypothesis that the adenylation
domain conformational change is a structural mechanism to guide
the PCP between active sites in the context of complete NRPS modules.
The rotation of the adenylation domain C-terminal subdomain
from the adenylate-forming conformation in AB3403 to the
thioester-forming conformation of EntF delivers the PCP into the adenylation
domain for loading. The recent structure of loaded holo-PCP has
shown the interaction of the substrate with the PCP core, which may
help to promote release of the substrate from the adenylation domain.
This interaction also alters the surface electrostatic potential of regions
that interact with the neighbouring catalytic domains, including α2
and α3, and may influence the PCP delivery to neighbouring catalytic
domains. Finally, this transfer is further assisted by the linker region
that joins the adenylation C-terminal subdomain with the PCP domain,
which includes important contacts that are preserved in the adenylate-
and thioester-forming conformations, as well as the open conformation
of SrfA-C.

The basic NRPS catalytic cycle requires that the PCP visits three
adjacent catalytic domains in a coordinated manner. The two catalytic
conformations of the adenylation domain require that the full
cycle has four catalytic structural states (Fig. 4). Specifically, (I) the
adenylation domain catalyses amino-acid adenylation, (II) the PCP is
delivered to the adenylation domain for thioester formation to load
the PCP, (III) the PCP is delivered to the condensation domain to
receive the upstream peptide, and finally (IV) the peptide is delivered
to a downstream condensation, thioesterase, or reductase domain for
release.

Figure 4 | Dynamics of the NRPS cycle. The four-stage catalytic cycle of
an NRPS module. The pantetheine cofactor is represented by the wavy
line with a terminal thiol, SH. The aminoacyl adenylate intermediate is
represented by AA-AMP. The thioester between the amino acid and the
cofactor is shown as S-AA. Finally, the peptide bound to the upstream
carrier protein (purple) is abbreviated Pep. Following the condensation
reaction, the peptide is extended by one amino acid (Pep + 1) and
presented to the thioesterase domain. The revised NRPS structural cycle is
highlighted in yellow showing that only three structural states are required.

Our results show that states I and III are identical and only three 
distinct conformations are required to accommodate the four catalytic 
states of the NRPS cycle (Fig. 4, yellow). The protein first adopts an 
adenylate-forming conformation, seen in AB3403, state III, to catalyse 
amino-acid adenylation. Through the domain rotation of the adenyl- 
ation C-terminal subdomain, the PCP is delivered to the adenylation 
domain to load the pantetheine cofactor, as seen in the crystal structure 
of EntF, state II. Return of the PCP to the condensation domain delivers
the loaded PCP for receipt of the upstream peptide, state III. Critically, 
as seen in AB3403, the adenylation domain can activate a second amino 
acid to prime the system for another cycle. The ability to simultane- 
ously catalyse peptide bond formation and amino-acid adenylation at 
two active sites significantly increases the overall catalytic efficiency 
and throughput of the NRPS module. Finally, although no structure 
exists of a full NRPS module with the PCP directed into the thioester- 
ase or other downstream domain in state IV, the structure of AB3403 
also offers a new view of the thioesterase domain and suggests the 
peptide-loaded PCP could be delivered to the downstream thioesterase 
domain through a simple rotation. 
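
The bookkeeping behind the revised cycle can be written out as a small lookup table. The sketch below is only an illustration of the argument: the state numbers follow Fig. 4, whereas the conformation labels (including "downstream-delivery" for the as yet unobserved state IV) and the helper function are hypothetical names chosen for this example.

```python
# Illustrative mapping of the four catalytic states (I-IV) onto structural
# conformations. The conformation labels are hypothetical names for this sketch.
CATALYTIC_STATES = {
    "I": ("amino-acid adenylation", "adenylate-forming"),
    "II": ("thioester formation (PCP loading)", "thioester-forming"),
    "III": ("peptide bond formation", "adenylate-forming"),
    "IV": ("delivery to downstream domain", "downstream-delivery"),
}


def distinct_conformations(states):
    """Return the unique conformations, in order of first appearance."""
    seen = []
    for _step, conformation in states.values():
        if conformation not in seen:
            seen.append(conformation)
    return seen


conformations = distinct_conformations(CATALYTIC_STATES)
print(len(CATALYTIC_STATES), "catalytic states,", len(conformations), "conformations")
```

Because states I and III share one conformation, the four catalytic states collapse onto three structural states.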

The modular architecture of NRPSs as well as their capacity to catalyse
unusual chemistry offer the potential for generating novel
products through engineering enzyme activity and the combination of 
heterologous domains. These efforts have been limited by deficiencies 
in our understanding of the functional interactions between domains 
and within active sites. The new views of two essential catalytic states 
in the NRPS cycle, an appreciation of the greater dynamics of NRPS 
systems, and the structures of holo-NRPS proteins with relevant lig- 
ands will provide the necessary insights to guide these engineering 
efforts. In addition, these studies complement the recent visualization 
of modular polyketide synthases by cryo-electron microscopy to set
the stage for investigations of the structural foundation of even larger, 
multi-modular biosynthetic proteins. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper.

METHODS


No statistical methods were used to predetermine sample size. The experiments 
were not randomized. The investigators were not blinded to allocation during 
experiments and outcome assessment. 

Expression, purification, and crystallization of AB3403. The human pathogen 
A. baumannii contains an uncharacterized NRPS cluster that has been implicated 
in motility and biofilm formation; the product of this operon is unknown. This 
operon contains eight genes. In strain AB307-0294 (ref. 26), from which the NRPS 
gene was cloned, this operon consists of genes ABBFA_003399 to ABBFA_003406. 
In the more commonly used ATCC17978 strain, the same genes are encoded by 
A1S_0119 to A1S_0112. The ABBFA_003403 (designated AB3403 herein) protein 
sequence is available at GenBank under accession number ACJ56070.1. 

The gene encoding AB3403 was PCR-amplified from AB307-0294 genomic 
DNA (courtesy of T. A. Russo). The amplified fragment was cloned into the
pET15b-TEV expression vector and confirmed by DNA sequencing. The vector
provides a His5-tag, linker, and tobacco etch virus (TEV) protease recognition site
that, upon treatment with TEV protease, yields a final recombinant product with 
glycine and histidine preceding the initial methionine residue. 

The AB3403 pET15b-TEV construct was transformed into E. coli (BL21-DE3) 
cells. Transformed cells were grown in LB media to an absorbance at 600 nm
(A600 nm) of 0.6 at 37 °C. Protein expression was induced by addition of 0.5 mM
isopropyl-β-D-thiogalactoside (IPTG) and cells were incubated overnight at 16 °C.
Cells were harvested by centrifugation, flash-frozen in liquid nitrogen, and stored
at −80 °C. Selenomethionine-labelled protein was generated in M9 minimal media
using a metabolic inhibition protocol. All purification steps were identical to the
native protein. 

For purification, cells were resuspended in a buffer containing 50 mM HEPES 
(pH 7.5), 250 mM NaCl, 10 mM imidazole, and 0.2 mM TCEP. Cells were lysed by
mechanical disruption (Branson Sonifier) and the resulting lysate was clari- 
fied by centrifugation at 235,000g for 45 min. The cell lysate was passed over a 
His-trap (GE Healthcare) immobilized metal ion affinity column and washed with 
lysis buffer containing 50 mM imidazole. Bound proteins were eluted with the 
same buffer containing 300 mM imidazole. The protein was incubated with TEV 
protease and dialysed against a TEV cleavage buffer (50 mM HEPES (pH 8.0), 
250 mM NaCl, 0.2 mM TCEP, and 0.5 mM EDTA) for 16 h at 4 °C. This partly
purified protein was then phosphopantetheinylated by incubation with His-tagged
non-specific phosphopantetheinyl transferase Sfp (10 nM), 12.5 mM MgCl2, and
1 mM CoA for 60 min at 20 °C. The clarified protein was then passed over the
His-trap column a second time to remove uncleaved protein, the TEV protease, Sfp,
and other contaminating proteins. The holo-AB3403 protein in the column flow-through
was pooled, dialysed against a size exclusion buffer containing 50 mM
HEPES (pH 7.5), 150 mM NaCl, 0.2 mM TCEP, and further purified by gel filtration
(Superdex200). Protein concentration was assessed after dialysis against a crystallization
buffer (25 mM HEPES (pH 7.5), 50 mM NaCl, 0.2 mM TCEP) using an
extinction coefficient at 280 nm of 157,570 M⁻¹ cm⁻¹.
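
The concentration estimate from an extinction coefficient follows the Beer-Lambert relation, A = ε·c·l. The sketch below shows that arithmetic with the coefficient quoted above. The absorbance reading, the 1 cm path length, and the function names are assumptions made for illustration, and the molar mass needed to convert to mg ml⁻¹ must be supplied by the user because it is not stated here.

```python
# Beer-Lambert estimate of protein concentration: A = epsilon * c * l.
EPSILON_280 = 157_570.0  # M^-1 cm^-1, extinction coefficient quoted in the text


def molar_concentration(a280, path_cm=1.0, epsilon=EPSILON_280):
    """Return the molar concentration (mol/l) from the absorbance at 280 nm."""
    if path_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and extinction coefficient must be positive")
    return a280 / (epsilon * path_cm)


def mg_per_ml(a280, molar_mass_da, path_cm=1.0, epsilon=EPSILON_280):
    """Convert to mg/ml (equivalently g/l); molar_mass_da is user-supplied."""
    return molar_concentration(a280, path_cm, epsilon) * molar_mass_da


# Hypothetical reading: A280 = 0.5 in a 1 cm cuvette.
conc_molar = molar_concentration(0.5)
print(f"{conc_molar * 1e6:.2f} uM")  # prints "3.17 uM"
```

In practice a concentrated sample would be diluted into the linear range of the spectrophotometer and the result multiplied by the dilution factor.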

Crystallization conditions for holo-AB3403 were initially identified from
a sparse matrix screen at 20 °C. Final crystals for native and SeMet-labelled
holo-AB3403 were grown at 14 °C by hanging-drop vapour diffusion against
0.75-0.95 M potassium citrate, 0.01-0.025 M glycine, and 0.05 M bis-tris propane
(BTP) (pH 8.0). Highest-quality native crystals were obtained using a protein concentration
of 5.5 mg ml⁻¹ with a protein to cocktail ratio of 1.5:1. SeMet protein
was crystallized in the same manner with a protein concentration of 7.5 mg ml⁻¹
and 1:1 protein to cocktail ratio. To obtain crystals in the presence of ligands, the
protein was pre-incubated for 45 min at 4 °C with 2 mM MgCl2, and 1.5-fold molar
excess of ATP and glycine.
Structure determination of AB3403. Crystals of holo-AB3403 were cryoprotected 
by stages using either ethylene glycol or potassium citrate for native and SeMet 
protein, respectively. The native protein crystals were cryo-protected with cock- 
tails containing 1.0 M potassium citrate, 0.3 M glycine, 0.05 M BTP (pH 8.0), and 
increasing (8, 16, and 24%) v/v ethylene glycol. The SeMet-labelled protein was 
cryo-protected with cocktails containing 0.3 M glycine, 0.05 M BTP (pH 8.0) and
increasing (1.0, 1.2, 1.4, and 1.6 M) potassium citrate. Crystals derived from protein
co-crystallized with ligands included the same concentration of MgCl2, ATP, and
glycine in the cryo-protectant cocktails. 

Diffraction data were collected on APS beamline 23-IDB. The native data 
(2.7 Å) were collected using a multi-crystal, multi-data set strategy using two
crystals. A complete low-resolution scan was taken for one crystal followed by a 
higher-resolution scan of the best diffracting crystal. A high-resolution region of 
the second crystal was combined with the two scans from the first crystal. The opti- 
mal regions were identified with the JBLU-ICE software at the GM/CA beamline. 
A single peak wavelength data set (3.35 Å) was collected for SeMet-labelled protein.
The liganded protein data set was collected with a single crystal.
Diffraction data were indexed, merged, and scaled using iMOSFLM in space
group P4,2)2. Structure determination was performed with PHENIX using a
combination of experimental single-wavelength anomalous diffraction (SAD)
phasing and phased molecular replacement. A partial molecular replacement
solution was positioned through PHASER with a sculpted (PHENIX sculptor)
model derived from PheA (PDB accession number 1AMU) and CytC1 (PDB
3VNR). Using this partial molecular replacement model, the selenium sites were
identified with the SAD data from SeMet-labelled crystals. An initial model was
produced with PHENIX Autobuild that contained ~65% of the protein molecule,
spread across multiple symmetry related molecules. This model was combined
into a single protein chain, built and refined iteratively against native data using
ARP-WARP, COOT, and PHENIX refine.

The final refinements were performed with translation-libration-screw-rotation
(TLS) parameterization with groups consisting of residues 1:191,
191:445, 446:480, 481:862, 863:959, 960:973, 974:1044, and 1054:1318, roughly
defining the NRPS domain (or subdomain) boundaries. The protein is complete
from residues Asn2 to Pro1319 with two small disordered loops in the adenylation
domain at Asn500-Asp501 and Gly627-Gly630. The latter loop is part of
the conserved serine/threonine- and glycine-rich P-loop that is involved in binding
the triphosphate of the nucleotide. Additionally, the condensation domain
contains electron density for a diacylglycerol lipid molecule that co-purified with 
the protein and potentially derived from the bacterial membrane during cell dis- 
ruption. Diffraction and refinement statistics are presented in Extended Data 
Table 1. Experimental electron densities of the ligands of both structures are pre- 
sented in stereo format in Extended Data Fig. 3. 
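
A residue-range list such as the TLS group definition above is easy to check for shared or unassigned residues before it is passed to a refinement program. The sketch below is a generic consistency check written for this example, with hypothetical function names; it does not use or reproduce any PHENIX interface, and it takes the ranges exactly as printed.

```python
# Residue ranges copied from the TLS group definition quoted in the text.
TLS_GROUPS = "1:191, 191:445, 446:480, 481:862, 863:959, 960:973, 974:1044, 1054:1318"


def parse_groups(spec):
    """Parse 'start:end' pairs into a sorted list of (start, end) tuples."""
    groups = []
    for item in spec.split(","):
        start, end = (int(part) for part in item.strip().split(":"))
        if start > end:
            raise ValueError(f"invalid range {item.strip()!r}")
        groups.append((start, end))
    return sorted(groups)


def find_overlaps_and_gaps(groups):
    """Compare each group with the next; return (overlaps, gaps) as residue ranges."""
    overlaps, gaps = [], []
    for (s1, e1), (s2, e2) in zip(groups, groups[1:]):
        if s2 <= e1:
            overlaps.append((s2, min(e1, e2)))
        elif s2 > e1 + 1:
            gaps.append((e1 + 1, s2 - 1))
    return overlaps, gaps


overlaps, gaps = find_overlaps_and_gaps(parse_groups(TLS_GROUPS))
print("shared residues:", overlaps)
print("unassigned residues:", gaps)
```

Run on the ranges as printed, the check reports residue 191 in two groups and residues 1045-1053 in none. Whether these reflect the deposited parameterization or a transcription detail cannot be determined from the text alone.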
Purification of EntF. The enterobactin biosynthetic cluster of E. coli has been used 
as a model system in many studies. The full-length EntF, containing the condensa- 
tion, adenylation, PCP, thioesterase domain architecture, loads serine onto the PCP 
domain. The condensation domain then recognizes the external carrier protein 
EntB that has been loaded with 2,3-dihydroxybenzoate (DHB) by the activity of the 
freestanding adenylation domain EntE. The DHB-serine amide is then transferred 
to the thioesterase domain while two additional cycles of synthesis complete the 
enterobactin trilactone. 

The EntF protein used in this study (GenBank P11454) was described
previously. The entF gene was PCR amplified from E. coli JM109 and cloned
into a pET15-TEV vector with an N-terminal 5×His-tag and a TEV protease cleavage
site. The entF vector was transformed into E. coli (BL21-DE3) cells for protein
expression. Cells were grown in lysogeny broth (LB) media to A600 nm = 0.6 at
37 °C before protein induction with 1 mM IPTG. Cells were grown overnight at
16 °C and collected by centrifugation. The cell pellets were flash frozen in liquid
nitrogen. Selenomethionine-labelled EntF was expressed in M9 minimal media
as described.

For purification of both native and SeMet-labelled protein, cells were resuspended
in lysis buffer containing 50 mM Tris-HCl pH 7.5, 400 mM NaCl, 0.2 mM
TCEP, 10% glycerol, and 10 mM imidazole. Cells were lysed via sonication and
centrifuged at 235,000g for 45 min. Initial purification was achieved with a His-trap
immobilized metal ion affinity column. Protein was eluted using lysis buffer
with 300 mM imidazole. EntF was incubated with TEV protease overnight at 4 °C
in a cleavage buffer containing 50 mM Tris pH 7.5, 400 mM NaCl, 0.2 mM TCEP,
10% glycerol, and 0.5 mM EDTA. Although the protein was expressed in E. coli,
phosphopantetheinylation was assured by the addition of 10 nM Sfp, 1 mM CoA, and 12.5 mM
MgCl2. The reaction was incubated at room temperature (22 °C) for 1-2 h. The
holo-EntF was run over an immobilized metal ion affinity column once more to
remove uncleaved protein along with Sfp. A final polishing step was performed
with a Superdex 200 16/600 column in a final dialysis buffer containing 50 mM
EPPS pH 8.0, 150 mM NaCl, 0.2 mM TCEP, 1 mM MgCl2, and 10% glycerol. Before
crystallization, the Ser-AVS inhibitor was added at a concentration four times that
of EntF and allowed to incubate for 2-4 h at room temperature.

For electron microscopy, native EntF was purified as above with the exception 
that a minimal dialysis buffer was used, which contained 50 mM EPPS pH 8.0,
100 mM NaCl, and 0.2 mM TCEP. No inhibitor was added.

Crystal conditions for the Ser-AVS inhibited EntF were first identified using the
Hauptman-Woodward high-throughput screen. Large diffraction-quality native
and SeMet crystals were grown using hanging drop vapour diffusion at 20 °C.
A crystallization cocktail, consisting of 100 mM BTP pH 7.5, 125-150 mM MgCl2,
and 22-28% PEG 4000, was diluted 1:1 with the final dialysis buffer. The hanging
drops then combined protein at 30 mg ml⁻¹ and the undiluted crystallization cocktail
at a ratio of 1:2. This 'batch mimic' limited the differences between the drop and
reservoir and has been successful with other protein samples in our laboratory.
Structure determination of EntF. Native EntF crystals were cryoprotected by
the addition of 2,3-butanediol directly to the crystallization drop to a final concentration
of ~10%. SeMet crystals were cryoprotected similarly except with
glycerol to a final concentration of ~20%. Diffraction data were collected on APS
beamline 23-IDB using the rastering option to find the optimal spots on both the
native and the SeMet crystals. Diffraction data were indexed, merged, and scaled using
iMOSFLM in space group P4,2;2. Structure determination for the SeMet inflection
data was performed in PHENIX using a PhaserEP MR-SAD with a partial
molecular replacement solution that was obtained using a sculpted model (generated
with PHENIX sculptor) derived from the Pseudomonas aeruginosa bidomain
adenylation-PCP protein PA1221 (PDB 4DG9). Automated model building with
BUCCANEER was used to build ~65% of the structure. This partial model from
the SeMet data was used as a molecular replacement model for the native data, and
the remaining portion of the protein was built by hand (excluding the thioesterase
domain, which was unresolved and constitutes about 19%). This model was built
and refined iteratively using COOT and PHENIX refine. TLS refinement was
used in final stages with groups consisting of residues 5:186, 187:429, 430:444,
445:857, 858:964, 965:971, and 972:1045.

The final model showed density for the condensation, adenylation, and PCP 
domains of EntF; no density was observed for the thioesterase domain. Diffraction 
and refinement statistics are presented in Extended Data Table 2. 

In general, the overall quality of the density was weaker for the N-terminal 
subdomain of the condensation domain, residues 1-186, probably reflecting the 
higher mobility of this region of the protein. The average B-factors for different 
regions of the protein (Extended Data Table 2) support this conclusion. 
Negative-stain electron microscopy analysis of EntF. EntF, purified as described
above, was prepared for electron microscopy using the conventional negative
staining protocol, and imaged at room temperature with a Tecnai T12 electron
microscope operated at 120 kV using low-dose procedures. Images were recorded
at a magnification of ×71,138 and a defocus value of ~1.5 μm on a Gatan US4000
CCD camera. All images were binned (2 pixels × 2 pixels) to obtain a pixel size of
4.16 Å on the specimen level. Particles were manually excised using e2boxer (part
of the EMAN 2 software suite). Two-dimensional reference-free alignment and
classification of particle projections was performed using ISAC. A total of 17,431
projections of EntF were subjected to ISAC, producing 133 classes consistent over
two-way matching and accounting for 5,344 particle projections (Extended Data
Fig. 8B).

Synthesis of serine adenosine vinylsulfonamide. Ser-AVS was synthesized using 
the protocol summarized in Extended Data Fig. 6. All reactions were performed 
under an inert atmosphere of dry Ar in oven-dried (150 °C) glassware. ¹H and ¹³C 
NMR spectra were recorded on a Varian 600 MHz spectrometer. Proton chemical 
shifts are reported in parts per million from an internal standard of residual chlo- 
roform (7.26 p.p.m.) or methanol (3.31 p.p.m.), and carbon chemical shifts are 
reported using an internal standard of residual chloroform (77.3 p.p.m.) or meth- 
anol (49.1 p.p.m.). Proton chemical data are reported as follows: chemical shift, 
multiplicity (s, singlet; d, doublet; t, triplet; m, multiplet; br, broad), integration, 
coupling constant. High-resolution mass spectra were obtained on an Agilent TOF 
II time of flight/mass spectrometry (TOF/MS) instrument equipped with either an 
ESI or APCI interface. Thin-layer chromatography (TLC) analyses were performed 
on TLC silica gel 60 F254 from EMD Chemical, and were visualized with ultraviolet 
light or 10% PMA solution. Purifications were performed by flash chromatography 
on silica gel (Dynamic Adsorbents, 60 Å). 

Materials. Chemicals, reagents, and solvents were purchased from Sigma Aldrich, 
Chem-Impex, or Acros Organic Fischer, and were used as received. An anhydrous 
solvent-dispensing system (J. C. Meyer) using two packed columns of neutral 
alumina was used for drying tetrahydrofuran (THF) and Et2O, while two packed 
columns of molecular sieves were used to dry DMF; the solvents were dispensed 
under argon. Compound 1 was purchased from Chem-Impex and used as received. 
Compounds 2 (ref. 41) and 4 (ref. 10) were synthesized according to the reported 
procedures. 

tert-Butyl (R,E)-4-(2-(N-(tert-butoxycarbonyl)sulfamoyl)vinyl)-2,2- 
dimethyloxazolidine-3-carboxylate (3). To a solution of tert-butyl (2) (395 mg, 
1.0 mmol, 2.0 equiv) in 1:3 DMF-THF (4 ml) at −78 °C was added a 1 M solution 
of LiHMDS in THF (2.0 ml, 4.0 equiv) dropwise over 15 min, and the solution 
was stirred at −78 °C for an additional 15 min. Next, Garner's aldehyde (1) 
(115 mg, 0.5 mmol, 1.0 equiv) in THF (1 ml) was added to the reaction over 
15 min. The solution was gradually warmed to 25 °C and stirred for 15 h. The 
solvent was removed in vacuo and the mixture was taken up in H2O (30 ml). 
The pH was adjusted to 3–4 with 1 N aqueous HCl, and the mixture was extracted 
with ethyl acetate (EtOAc) (3 × 20 ml). The combined organic layers were washed 
with H2O (30 ml) and saturated aqueous NaCl (30 ml), dried (MgSO4), and 
concentrated. Purification by flash chromatography (10% EtOAc-hexanes to 50% 
EtOAc-hexanes) afforded the title compound 3 as a colourless oil (150 mg, 74%): 
retardation factor (Rf) = 0.50 (50:50 EtOAc-hexanes); [α] +0.9 (c 0.02, CH2Cl2); 
¹H NMR (600 MHz, CD3OD) δ 1.45 (s, 3H), 1.48 (m, 9H), 1.51 (s, 9H), 1.60 (s, 3H), 
3.83–3.85 (m, 1H), 4.15 (dd, J = 12.0, 6.0 Hz, 1H), 4.56–4.58 (m, 1H), 
6.64 (d, J = 18 Hz, 1H), 6.77–6.81 (m, 1H); ¹³C NMR (150 MHz, CD3OD) δ 28.41, 
28.47, 28.80, 28.81, 58.7, 68.3, 84.19, 84.22, 95.8, 130.6, 145.7, 152.2, 152.7; 
HRMS (ESI−) calculated for C17H29N2O7S [M − H]− 405.1701, 
found 405.1721 (error 4.9 p.p.m.). 
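
As a quick consistency check on the quantities reported for compound 3, the yield can be recomputed from the isolated mass and the limiting reagent. The sketch below is illustrative only: the neutral formula C17H30N2O7S is inferred from the reported [M − H]− ion, the atomic masses are rounded standard values, and the function names are hypothetical.

```python
# Rounded standard atomic masses in g/mol (assumed values for this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def molar_mass(formula):
    """Average molar mass (g/mol) of a formula given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def percent_yield(product_mg, limiting_mmol, mw_g_per_mol):
    """Isolated yield in percent; mmol x g/mol gives the theoretical mass in mg."""
    theoretical_mg = limiting_mmol * mw_g_per_mol
    return 100.0 * product_mg / theoretical_mg


# Neutral compound 3, inferred from the [M - H]- ion C17H29N2O7S.
COMPOUND_3 = {"C": 17, "H": 30, "N": 2, "O": 7, "S": 1}

mw_3 = molar_mass(COMPOUND_3)
# 150 mg isolated; Garner's aldehyde (0.5 mmol) is the limiting reagent.
yield_3 = percent_yield(150.0, 0.5, mw_3)
print(f"MW = {mw_3:.1f} g/mol, yield = {yield_3:.0f}%")
```

Under these assumptions the calculation agrees with the reported 74% yield.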

Ser-AVS. To a solution of N6,N6-bis(tert-butoxycarbonyl)-2′,3′-O-isopropylideneadenosine 
(4) (73 mg, 0.14 mmol, 1.1 equiv), vinylsulfonamide (3) (52 mg, 
0.13 mmol, 1.0 equiv) and PPh3 (56 mg, 0.21 mmol, 1.7 equiv) in THF (1 ml) at 
0 °C was added a solution of DIAD (42 µl, 0.21 mmol, 1.7 equiv) in THF (1 ml) 
over 1 h using a syringe pump. The solution was gradually warmed to 23 °C 
and stirred overnight. The mixture was filtered over a short pad of silica gel, 
which was washed with 20% EtOAc-hexanes (100 ml). The filtrate was concentrated 
to afford crude 5 (Rf = 0.45, 50:50 EtOAc-hexanes), which was used in the 
next step without further purification. To a solution of crude 5 from the previous 
step was added 80% aqueous trifluoroacetic acid (1 ml) at 0 °C. The solution was 
stirred for 6 h at 0 °C, then concentrated. Recrystallization from 1:20 MeOH-Et2O 
(5 ml) afforded the title compound (32 mg, 47%) as a colourless film: [α] −10.3 
(c 0.600, MeOH); ¹H NMR (600 MHz, CD3OD) δ 3.30–3.39 (m, 2H), 3.67–3.70 
(m, 1H), 3.83 (dd, J = 11.6, 4.1 Hz, 1H), 4.05–4.08 (m, 1H), 4.22–4.25 (m, 1H), 
4.34–4.35 (m, 1H), 4.77–4.81 (m, 1H), 5.94 (d, J = 6.2 Hz, 1H), 6.70 (dd, J = 15.4, 
6.5 Hz, 1H), 6.77 (d, J = 15.4 Hz, 1H), 8.27 (s, 1H), 8.29 (s, 1H); ¹³C NMR (150 MHz, 
CD3OD) δ 45.8, 54.1, 62.3, 72.9, 74.8, 85.8, 91.7, 121.3, 134.8, 137.0, 143.2, 149.9, 
151.3, 156.1; HRMS (ESI+) calculated for C14H22N7O6S [M + H]+ 416.1347, found 
416.1339 (error 1.9 p.p.m.). 
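
The HRMS figures quoted for compound 3 and Ser-AVS can be cross-checked by recomputing the ion masses and the p.p.m. errors. This is a minimal sketch, not the instrument software's procedure: the monoisotopic masses and electron mass are standard values entered by hand, and the helper names are hypothetical.

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the electron mass.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207117,
}
ELECTRON = 0.00054858


def ion_mz(ion_formula, charge):
    """m/z of an ion whose atom counts already include the lost or gained H."""
    atom_sum = sum(MONOISOTOPIC[el] * n for el, n in ion_formula.items())
    # A cation has lost electrons (charge > 0); an anion has gained them.
    return (atom_sum - charge * ELECTRON) / abs(charge)


def ppm_error(found, calculated):
    """Signed mass error in parts per million."""
    return (found - calculated) / calculated * 1e6


# [M - H]- of compound 3 and [M + H]+ of Ser-AVS, as given in the text.
mz_3 = ion_mz({"C": 17, "H": 29, "N": 2, "O": 7, "S": 1}, charge=-1)
mz_ser_avs = ion_mz({"C": 14, "H": 22, "N": 7, "O": 6, "S": 1}, charge=+1)

err_3 = ppm_error(405.1721, 405.1701)
err_ser_avs = ppm_error(416.1339, 416.1347)
print(f"{mz_3:.4f} {mz_ser_avs:.4f} {abs(err_3):.1f} {abs(err_ser_avs):.1f}")
```

Both recomputed ion masses and both errors agree with the reported values when rounded to the same precision.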

Kinetic analysis of AB3403. Substrate preference for the adenylation domain of 
holo-AB3403 was established by the pyrophosphate exchange assay, which allows 
radiolabelled PPi to be incorporated into ATP in the reverse reaction. One micromolar 
holo-AB3403 was added to 2 mM ATP, 0.2 mM NaPPi, 50 mM HEPES (pH 7.5), 
100 mM NaCl, 10 mM MgCl2, 0.15 µCi [32P]PPi, and 5 mM substrate. Reactions 
(100 µl) were incubated for 10 min at 37 °C, then quenched with 0.5 ml 1.2% charcoal, 
0.1 M unlabelled PPi, and 0.35 M perchloric acid. The charcoal was pelleted 
by centrifugation, washed twice with 1 ml H2O, and resuspended in 0.5 ml H2O 
for scintillation counting. 

To determine the apparent kinetic constants for ATP and glycine for the 
holo-AB3403 adenylation domain, the NADH consumption assay, monitored at 
A340 nm (refs 43, 44), was used with full-length AB3403. Hydroxylamine was used 
as a surrogate for the pantetheine in the second partial reaction to displace AMP 
for use in the coupled consumption assay. Standard reactions contained 50 mM HEPES 
(pH 7.5), 15 mM MgCl2, 2 mM ATP, 3 mM phosphoenolpyruvate, 0.2 mM NADH, 
5 U myokinase, 5 U pyruvate kinase, 6.5 U lactate dehydrogenase, and 150 mM 
buffered hydroxylamine. Apparent kinetic constants were determined by varying 
the concentration of ATP or glycine with the other in excess. Reactions 
were initiated by the addition of 0.001 mM enzyme. Calculations were done using 
PRISM software. 
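
The arithmetic behind a coupled NADH assay and the resulting kinetic constants can be sketched as follows. This is not the authors' PRISM analysis. The sketch assumes the standard NADH extinction coefficient at 340 nm (6.22 mM⁻¹ cm⁻¹), a 1 cm path length, and two NADH oxidized per AMP released in the myokinase/pyruvate kinase/lactate dehydrogenase coupling; the function names and the example absorbance slope are hypothetical. The kcat and KM values are the apparent constants listed for glycine and ATP in Extended Data Fig. 2.

```python
EPSILON_NADH_340 = 6.22  # mM^-1 cm^-1, standard value for NADH at 340 nm


def amp_rate_uM_per_min(dA340_per_min, path_cm=1.0, nadh_per_amp=2):
    """Convert an absorbance slope into AMP formed (uM/min).

    Assumes two NADH are consumed per AMP (AMP + ATP -> 2 ADP via myokinase).
    """
    nadh_mM_per_min = dA340_per_min / (EPSILON_NADH_340 * path_cm)
    return nadh_mM_per_min * 1000.0 / nadh_per_amp


def michaelis_menten_rate(s_uM, kcat_per_min, km_uM, enzyme_uM):
    """Initial rate (uM/min) from the Michaelis-Menten equation."""
    return kcat_per_min * enzyme_uM * s_uM / (km_uM + s_uM)


def catalytic_efficiency(kcat_per_min, km_uM):
    """kcat/KM in uM^-1 min^-1."""
    return kcat_per_min / km_uM


# Apparent constants as listed in Extended Data Fig. 2.
eff_glycine = catalytic_efficiency(3.6, 1117.0)
eff_atp = catalytic_efficiency(25.0, 375.0)

# At [S] = KM the rate is half of kcat * [E]; enzyme at 1 uM (0.001 mM).
half_max = michaelis_menten_rate(1117.0, 3.6, 1117.0, 1.0)

# Hypothetical slope of 0.0622 absorbance units per minute.
example_rate = amp_rate_uM_per_min(0.0622)
print(eff_glycine, eff_atp, half_max, example_rate)
```

A nonlinear fit of rate against substrate concentration, as done in dedicated software, would estimate kcat and KM directly from the measured rates.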
[Extended Data Figure 1 image: sequence alignment of EntF, AB3403 and SrfA-C with conservation marks; the aligned rows are not recoverable from the extracted text.] 
Extended Data Figure 1 | Structure-based alignment of EntF, AB3403, 
and SrfA-C. Condensation, adenylation, PCP, and thioesterase domains 
are represented with bars in grey, pink, green-cyan, and blue. Conserved 
motifs and catalytically important residues are highlighted with the same 
colours, including the HHxxxD motif of the condensation domains, the 
aspartic acid hinge that separates the N- and C-terminal subdomains of the 
adenylation domain, the GGHS motif that is the site of pantetheinylation 
in the PCP, and the catalytic nucleophile of the thioesterase domain. 
The SrfA-C, AB3403, and EntF proteins share approximately 26% 
sequence identity. The adenylation and PCP domains are better 
conserved, sharing ~35% identity, whereas the condensation (21%) and 
thioesterase (25%) domains are less well conserved. Domain boundaries 
are described in the table below. 
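
Percent identity figures such as those quoted in this caption depend on how the denominator is defined. The sketch below shows one common convention, counting identical residues over alignment columns in which neither sequence has a gap. The source does not state which convention was used, the function name is hypothetical, and the two strings are toy sequences rather than the real proteins.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where both aligned sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy pairwise alignment (not taken from EntF, AB3403 or SrfA-C).
toy_a = "MKT-LLVA"
toy_b = "MRTALL-A"
toy_identity = percent_identity(toy_a, toy_b)
print(f"{toy_identity:.1f}% identity over ungapped columns")
```

Other conventions divide by the full alignment length or by the shorter sequence length, which gives lower values for gapped alignments.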
Pyrophosphate Exchange Assay 

Substrate    kcat (min⁻¹)    KM (µM)      kcat/KM (µM⁻¹ min⁻¹) 
Glycine      3.6             1117 ± 16    3.2 × 10⁻³ 
ATP          25              375 ± 11     6.6 × 10⁻² 

[Bar chart: activity by amino acid code (one-letter codes, plus 4CB and 4HB); bar values are not recoverable.] 

Extended Data Figure 2 | Substrate specificity of full-length AB3403. 
Amino-acid specificity of AB3403 was recorded for all 20 proteinogenic 
amino acids, as well as 4-chlorobenzoate (4CB) and 4-hydroxybenzoate 
(4HB). Average values and standard deviations are shown for three 
replicates with each substrate; results were recorded as micromoles of 
radiolabelled ATP incorporated per minute per milligram of enzyme. 
Apparent kinetic constants are also shown for ATP and glycine calculated 
from duplicate measurements for four to six substrate concentrations. 
[Panel titles: a, AB3403 Condensation Domain; b, AB3403 Adenylation Domain] 

Extended Data Figure 3 | Stereo representations of electron density 
figures shown in Fig. 2. To better visualize the active sites and electron 
density quality, stereo figures are included in the extended data. In 
all panels, density is shown with coefficients of the form (Fo − Fc) 
calculated before inclusion of ligands and contoured at 3σ. a, Stereo 
representation of electron density of AB3403 condensation domain 
shows the phosphopantetheine on Ser1006 approaching His145 within the 
condensation domain pocket. Inhibitor carbon atoms in green, carbons 
of residues within 5 Å of inhibitor in grey, nitrogen in blue, oxygen in 
red, sulphur in yellow, and water in light blue. b, Electron density of the 
nucleotide binding pocket of AB3403 bound to glycine and AMP. Stereo 
representation of electron density shows the AMP, glycine, and Mg2+ present 
in the active site of the adenylation domain. Ligand carbon atoms are in 
green, carbons of residues within 5 Å of inhibitor in grey, nitrogen in blue, 
oxygen in red, phosphorus in orange, and the Mg2+ cofactor in purple. 
c, Stereo representation of the electron density shows the phosphopantetheine 
on Ser1006 covalently attached to the Ser-AVS inhibitor in the active site 
of the adenylation domain. Inhibitor carbon atoms in green, carbons of 
residues within 4 Å of inhibitor in grey, nitrogen in blue, oxygen in red, 
phosphorus in orange, sulphur in yellow, and water in light blue. 
Extended Data Figure 4 | Comparison of AB3403 and SrfA-C PCP-condensation 
domain interaction. Stereo representation illustrating 
different orientations of the PCP domains of SrfA-C and AB3403 relative 
to the condensation domains with which they interact. AB3403 is shown 
with a white condensation domain and a green-cyan PCP. SrfA-C is shown 
with a yellow condensation domain and a pale blue PCP. The pantetheine 
of AB3403 is shown bound to Ser1006. The position of Ser1003, mutated 
to an alanine residue in SrfA-C, is also highlighted. 
Extended Data Figure 5 | Comparison of AB3403 thioesterase domain 
to the functional PCP-thioesterase interaction. Stereo representation 
showing that the thioesterase domain (blue) of AB3403 interacts with the 
back face of the PCP domain in AB3403. The functional interaction between the 
EntF thioesterase domain and its holo-PCP, trapped crystallographically, 
illustrates that the same face of the thioesterase domain interacts 
functionally (PDB 3TEJ). A 28-residue insertion of AB3403 is coloured 
yellow. 
Extended Data Figure 6 | Synthesis of Ser-AVS. The Ser-AVS probe was 
synthesized following similar protocols described elsewhere. Garner's 
aldehyde 1 was coupled with 2 using LiHMDS to exclusively furnish the 
(E)-vinylsulfonamide 3. Mitsunobu coupling of 3 with bis-Boc adenosine 
4 afforded 5, which was globally deprotected using 80% aqueous 
trifluoroacetic acid to yield Ser-AVS. 
Extended Data Figure 7 | Electrophoretic mobility of EntF. a, Native 
gel electrophoresis. Lane 1: EntF. Lane 2: EntF incubated with fourfold 
molar excess of Ser-AVS inhibitor. Lane 3: EntF crystals. Lane 4: Novex 
NativeMark labelled in kilodaltons. b, Denaturing gel electrophoresis 
using loading buffer with SDS and β-mercaptoethanol. Gel lane 1: EntF. 
Lane 2: EntF incubated with a fourfold excess of Ser-AVS inhibitor. Lane 3: Life 
Technologies Mark12 labelled in kilodaltons. The native gel shows the 
inhibited EntF in a more compact conformation compared with EntF 
without the inhibitor. 


© 2016 Macmillan Publishers Limited. All rights reserved 


Extended Data Figure 8 | Negative-stain electron microscopy analysis of EntF. a, Raw electron microscopy image of negative-stained EntF. b, Class 
averages of EntF particles. 
Extended Data Table 1 | Diffraction data statistics and refinement statistics for AB3403 

                           SeMet_AB3403           AB3403                 Liganded AB3403 
PDB code                                          4ZXH                   4ZXI 
Beamline                   APS 23-ID-B            APS 23-ID-B            APS 23-ID-B 
Wavelength (Å)             0.9793                 0.9796                 1.0332 
Space group                P4₃2₁2                 P4₃2₁2                 P4₃2₁2 
Unit cell a, b, c (Å)      116.19 116.19 348.61   116.19 116.19 348.61   116.10 116.10 342.02 
Molecules per ASU          1                      1                      1 
Resolution range (Å)       29.75–3.35             49.80–2.70             45.03–2.90 
                           (3.52–3.35)            (2.79–2.70)            (3.00–2.90) 
Total reflections          137397 (16096)         416923 (21743)         257325 (25582) 
Unique reflections         34599 (4299)           66559 (6495)           52900 (5187) 
Multiplicity               4.0 (3.7)              6.3 (3.4)              4.9 (4.9) 
Completeness (%)           98.9 (94.6)            99.96 (100.00)         99.99 (100.00) 
Mean I/σ(I)                11.9 (3.5)             9.91 (2.49)            8.47 (2.19) 
Rmerge                     0.090 (0.359)          0.125 (0.511)          0.130 (0.641) 
Rmeas                      0.116                  0.143                  0.162 
CC1/2                      0.993 (0.798)          0.991 (0.685)          0.991 (0.635) 
CC*                        0.998 (0.937)          0.998 (0.902)          0.998 (0.881) 

Structure refinement 
Rfactor                                           0.179 (0.248)          0.174 (0.307) 
Rfree                                             0.234 (0.322)          0.225 (0.369) 
No. atoms                                         10301                  10198 
RMSD bond distances (Å)                           0.009                  0.009 
RMSD bond angles (°)                              1.18                   1.18 
Wilson B-factor (Å²)                              41.15                  50.83 
Average B-factor (Å²) 
  Protein                                         46.00                  53.60 
  Ligand                                          54.40                  55.30 
Ramachandran analysis 
  Favored (%)                                     97.0                   96.0 
  Allowed (%)                                     2.3                    3.4 
  Outliers (%)                                    0.7                    0.6 

Values in parentheses are for the highest resolution shell. 
Extended Data Table 2 | Diffraction data statistics and refinement statistics for EntF 

                                      SeMet EntF              EntF 
PDB code                                                      4ZXJ 
Beamline                              APS 23-ID-B             APS 23-ID-B 
Wavelength (Å)                        0.9796                  1.0332 
Space group                           P4₁2₁2                  P4₁2₁2 
Unit cell a, b, c (Å)                 127.55 127.55 186.72    127.71 127.71 186.94 
Molecules per ASU                     1                       1 
Resolution range (Å)                  60–2.9 (3.0–2.9)        81.31–2.80 (shell limits illegible) 
Total reflections                     152578 (15129)          175399 (17288) 
Unique reflections                    34693 (3380)            38753 (3800) 
Multiplicity                          4.4 (4.5)               4.5 (4.5) 
Completeness (%)                      99.66 (99.41)           99.96 (99.92) 
Mean I/σ(I)                           9.86 (2.49)             9.85 (2.11) 
Rmerge                                0.1153 (0.6165)         0.0979 (0.6484) 
Rmeas                                 0.1312                  0.1109 
CC1/2                                 0.995 (0.598)           0.997 (0.629) 
CC*                                   0.999 (0.865)           0.999 (0.879) 

Structure refinement 
Rfactor                                                       0.183 (0.290) 
Rfree                                                         0.230 (0.324) 
No. protein/ligand atoms                                      7898/49 
RMSD bond distances (Å)                                       0.008 
RMSD bond angles (°)                                          1.23 
Wilson B-factor (Å²)                                          62.88 
Average B-factor (Å²) 
  Protein                                                     74.6 
  Ligand                                                      49.8 
  Condensation N-terminal subdomain                           90.5 
  Condensation C-terminal subdomain                           84.8 
  Adenylation N-terminal subdomain                            54.9 
  Adenylation C-terminal subdomain                            54.8 
  PCP domain                                                  78.8 
Ramachandran analysis 
  Favored (%)                                                 95.0 
  Allowed (%)                                                 4.13 
  Outliers (%)                                                0.87 
Molprobity ClashScore                                         6.03 

Values in parentheses are for the highest resolution shell. 
Synthetic cycle of the initiation module of a 
formylating nonribosomal peptide synthetase 


Janice M. Reimer¹*, Martin N. Aloise¹*, Paul M. Harrison² & T. Martin Schmeing¹ 


Nonribosomal peptide synthetases (NRPSs) are very large proteins 
that produce small peptide molecules with wide-ranging biological 
activities, including environmentally friendly chemicals and 
many widely used therapeutics¹. NRPSs are macromolecular 
machines, with modular assembly-line logic, a complex catalytic 
cycle, moving parts and many active sites. In addition to the 
core domains required to link the substrates, they often include 
specialized tailoring domains, which introduce chemical 
modifications and allow the product to access a large expanse 
of chemical space. It is still unknown how the NRPS tailoring 
domains are structurally accommodated into megaenzymes or how 
they have adapted to function in nonribosomal peptide synthesis. 
Here we present a series of crystal structures of the initiation 
module of an antibiotic-producing NRPS, linear gramicidin 
synthetase. This module includes the specialized tailoring 
formylation domain, and states are captured that represent every 
major step of the assembly-line synthesis in the initiation module. 
The transitions between conformations are large in scale, with 
both the peptidyl carrier protein domain and the adenylation 
subdomain undergoing huge movements to transport substrate 
between distal active sites. The structures highlight the great 
versatility of NRPSs, as small domains repurpose and recycle their 
limited interfaces to interact with their various binding partners. 
Understanding tailoring domains is important if NRPSs are to be 
utilized in the production of novel therapeutics. 

Tailoring domains embedded within NRPSs are vital for the production 
and bioactivity of the nonribosomal peptide (NRP) products 
of these synthetases. Tailoring domains exist in addition to the core 
NRPS adenylation (A), peptidyl carrier protein (PCP) and condensa- 
tion (C) domains, which a module requires to add an amino acid to 
the growing NRP: the A domain selects, activates and transfers the 
substrate amino acid to the PCP domain, which transports it to the C 
domain for peptide bond formation¹ (Fig. 1 and Extended Data Fig. 1). 
Tailoring domains are common in NRPSs. For example, cyclosporin 
synthetase contains methyltransferase domains; daptomycin (Cubicin) 
synthetase, epimerization domains; bacitracin (BACiiM) synthetase, 
a heterocyclization domain; valinomycin synthetase, ketoreductase 
domains; bleomycin synthetase, an oxidase domain; and soframycin 
synthetase, a reductase domain. These domains enable key functionalities 
of the NRP by, for example, providing protease resistance, 
enabling novel interactions, improving affinity by limiting NRP conformational 
flexibility, or allowing the NRP to assume its active conformation. 
Linear gramicidin synthetase (comprising LgrA, LgrB, LgrC, 
and LgrD) was previously shown to contain an active formylation (F) 
domain as the first domain of its F-A-PCP initiation module (Fig. 1). 
F domains are homologous to formyltransferase proteins that modify 
substrates in three diverse pathways: ribosomal translation, purine 
anabolism and bacterial outer membrane synthesis. The LgrA initiation 
module must formylate its substrate for linear gramicidin synthesis 
to proceed (Fig. 1), and this formyl group is essential for the clinically 
important antibacterial activity of gramicidin. Gramicidin molecules 
form head-to-head dimers through the formyl group to make a 
β-helical pore in gram-positive bacterial membranes. This pore allows 
free passage of monovalent cations, destroying the ion gradient and 
killing bacteria. 

We have determined four independent crystal structures of the initiation 
module of LgrA at 2.5, 2.6, 2.8, and 3.8 Å resolution (Extended 
Data Table 1 and Extended Data Fig. 2), showing four different functional 
conformations: the A domain open (substrate binding), the A 
domain closed (adenylation), thiolation and formylation states (Fig. 2, 
Extended Data Fig. 3 and Supplementary Video 1). Our data augment 
the existing structural knowledge of NRPSs (reviewed in ref. 2), by 
visualizing the structure of an NRPS module that includes a tailoring 
domain, showing how the tailoring domain is incorporated into and 
used as part of an NRPS, observing several functional states (open, 
closed, thiolation) in a single protein rather than over different excised 
mono- and didomains, and visualizing a novel functional state 
(formylation). 

Figure 1 | A schematic of the action of the linear gramicidin synthetase 
initiation module. a, The F-A-PCP initiation module is the first module 
of LgrA, the dimodular F-A-PCP-C-A-PCP-E* NRPS protein in 
the LgrA-E synthetic cluster (E*, inactive epimerization domain). The 
initiation cycle begins with valine selection and adenylation followed by 
thiolation onto the PPE arm of the PCP domain. The F domain formylates 
PCP-PPE-Val before it is brought to be the donor in the condensation 
reaction of the downstream module. b, Chemical structure of linear 
gramicidin A. 

Figure 2 | Crystal structures representing the steps of the synthesis 
cycle in the LgrA initiation module. a-d, The F-A-PCP LgrA initiation 
module in open (a), closed (b), thiolation (c) and formylation (d) states. 
(The PCP domain is not necessary for the open and closed states and is 
disordered in b and c.) The transition between thiolation and formylation 
states requires large rigid body movements of both the Asub and PCP 
domains. e, The PCP domain rotates 75° and translocates its centre of mass 
by 61 Å. The PPE arm attachment point, Ser729, moves 52 Å, and some 
residues move >80 Å. f, The Asub domain rotates 180° and translocates its 
centre of mass by 21 Å. 

The F domain is connected to the rest of the F-A-PCP LgrA initiation 
module through an interface with the A domain (Fig. 3) that 
buries 830 Å² of surface area. This is distinct from the C-A interface in 
C-A-PCP elongation and termination modules (Extended Data 
Fig. 4). The F-A interface appears sufficient to maintain these domains 
in a very elongated conformation (Fig. 2). Across all nonequivalent 
molecules in the crystals, the relative orientation between the two 
domains varies only by ~5°, and our small angle X-ray scattering analysis 
indicates that this extended conformation is representative of the 
initiation module in solution (Extended Data Fig. 5). This architecture 
means that the adenylation active site and the formylation active site 
are always ~50 Å apart, necessitating that the valine substrate travels 
a large distance between subsequent steps in synthesis. Accordingly, 
positions of the PCP domain and the Asub domain (C-terminal portion 
of the A domain) change markedly in the progression of the module 
through functional states. 

The NRPS assembly-line process (Fig. 1, Extended Data Fig. 1 and 
Supplementary Video 2) begins with ATP and valine binding to an 
open conformation of the A domain (Fig. 2a and Supplementary 
Video 1). The A domain closes on substrate binding by rotating 
the Asub by ~30° to catalyse formation of the valine adenylate 
(Fig. 2b). Next, the thiolation reaction transfers the valine from the 
adenylate to the thiol of the PCP domain phosphopantetheine arm 
(PPE). We accessed this state by attaching a non-hydrolysable analogue 
of the product of the reaction, Val-NH-PPE, to the PCP 
domain. The resulting structure shows the known 140° rotation of 
the Asub and the product Val-NH-PCP still bound to the active 
site (Fig. 2c and Extended Data Fig. 2e). The PCP then transports its 
valine 50 Å between the A and F domain active sites to accept a formyl 
group. Our next structure (Fig. 2d) shows that to achieve this, the PCP 
domain makes a very large rigid-body movement: a ~75° rotation and 61 Å 
translocation (Fig. 2e). The ~10-residue linker between the A and PCP 
domains is not nearly sufficient to span the 55 Å travelled by the first 
residue of the PCP domain; accordingly, the Asub domain undergoes a 
full 180° rotation and 21 Å translocation to allow the PCP domain to 
bind the F domain (Fig. 2f). There, Val-PCP accepts a formyl (f) group 
from the donor cofactor formyltetrahydrofolate (fTHF) (Extended Data 
Fig. 2f) onto its amino group. The PCP then moves the formyl-valine 
to the next module, where the condensation domain of that module 
will catalyse peptide bond formation between fVal-PCP and its Gly-PCP2, 
making the first peptide bond of linear gramicidin and liberating 
the PCP to participate in the next round of reactions (Supplementary 
Video 2). 
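
The statement that a ~10-residue linker cannot span 55 Å can be checked with a back-of-the-envelope bound. The sketch below assumes an upper limit of about 3.8 Å per residue, the approximate Cα–Cα spacing of a trans peptide unit; a real linker would usually reach less. The constant and function names are hypothetical and this is not a calculation taken from the source.

```python
MAX_SPAN_PER_RESIDUE = 3.8  # Angstrom; assumed upper bound (Ca-Ca spacing)


def max_linker_span(n_residues, per_residue=MAX_SPAN_PER_RESIDUE):
    """Upper bound on the end-to-end distance of a fully stretched linker."""
    return n_residues * per_residue


def linker_can_span(n_residues, distance):
    """True if a fully stretched linker could in principle cover the distance."""
    return max_linker_span(n_residues) >= distance


linker_residues = 10      # "~10-residue linker" in the text
required_distance = 55.0  # Angstrom travelled by the first PCP residue

reach = max_linker_span(linker_residues)
shortfall = required_distance - reach
print(f"max reach {reach:.0f} A, shortfall {shortfall:.0f} A")
```

Even under this generous bound the linker falls well short, consistent with the need for the Asub domain to move along with the PCP domain.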

How did the F domain become a functional NRPS domain? The 
F domain of LgrA was fused into an existing NRPS, and we suggest 
that the pre-transfer source was a single-domain formyltransferase 
from a distantly related bacterium with a signature of missing helix α2 
and strand β3. As the high incidence of horizontal transfer (Extended 
Data Fig. 6) is consistent with conferring a competitive advantage, and 
bacteria possessing formyltransferases similar to the F domain also 
have canonical transfer RNA and phosphoribosylglycinamide form- 
yltransferases, it is likely that the pre-transfer formyltransferase per- 
formed the remaining known formyltransferase function, that is, sugar 
formylation for cell wall synthesis. After fusion, the F-A-PCP initiation 
module evolved rapidly (Extended Data Figs 6 and 7). The fold of the 
first 171 amino acids of LgrA is conserved with the sugar formyltransferases, 
leaving only residues 172-179, including a single α-helix, as 
a new structural element and link to the A domain (Extended Data 
Fig. 8). A 'landing pad' evolved to include a hydrophobic patch for binding 
the PCP domain, and positive residues and hydrogen bond donors to 
interact with the PPE phosphate (Fig. 3 and Extended Data Fig. 8). The 
F-PCP interaction places the PPE attachment point, Ser729, an ideal 
16 Å away from the fTHF in the conserved formyltransferase active site. 
Notably, this positions the Val-PPE exactly in the thymidine diphosphate 
(dTDP)-sugar binding site of sugar formyltransferases 
(Fig. 4a). The similar length and hydrophilic nature of the dTDP-sugar 
and Val-PPE probably enabled the F domain to formylate Val-PCP 
soon after the fusion event, before formylation was absolutely required 
for downstream peptide synthesis to proceed. 

Figure 3 | Interdomain interfaces of the initiation module. a, The 
F domain is fused onto the A domain and forms a small hydrophobic core 
(Extended Data Fig. 8). b, Interaction of the PCP domain with Asub and 
F domains in the formylation state. The Asub domain creates an electrostatic 
platform for the PCP domain. The PCP domain binds to F domain 
hydrophobic residues Leu127 (often Lys or Glu in formyltransferases) 
and Met178 (in the C terminus of the F domain that is not similar to 
formyltransferases). The PPE phosphate interacts with Arg170 (often Glu, 
Ser or Asn in formyltransferases) and Asn177 (usually Glu, Asp or Met in 
formyltransferases). 

The PCP domain interaction with the F domain is quite minimal, 
and accordingly, the A,» domain donates an additional binding 
interaction in the formylation state? (Fig. 3b). This is reminiscent of 
methionyl-tRNA™* formyltransferase (FMT), the essential bacte- 
rial two-domain formyltransferase that uses its C-terminal domain 
(FMT crp) to present the methionyl-tRNA™* to the formyltransferase 
active site? (Fig. 4b, c). This functional convergent evolution presents 
another interesting parallel to ribosomal translation, the completely 
separate macromolecular system that also synthesizes peptides. In both 
LgrA and the ribosome, a mobile carrier macromolecule (PCP domain/ 
tRNA) covalently (through thioester/ester bonds) transports an amino 

Figure 4 | Comparisons of the F domain to sugar and tRNA
formyltransferases. a, The binding mode for the PPE arm to the F domain
is similar to that of sugar-dTDP in sugar formyltransferases (protein
WlaRD, PDB 4LY3 (ref. 15)). Note that the valine and most of the PPE arm
(carbons shown in grey) are modelled, as they are not visible in electron
density maps at 3.8 Å resolution. Qui3NFo, 3,6-dideoxy-3-formamido-
D-glucose. b, The Asub domain emulates the positioning role of the
FMTCTD in methionyl-tRNAfMet formyltransferases (PDB 2FMT (ref. 13)).
Excluding the PPE arm, the PCP domain buries only 279 Å² of F domain
surface. The Asub provides an additional 345 Å² of interaction surface to
position the PCP domain.


acid to a formyltransferase enzyme (F domain/FMT), where the carrier
is oriented by a positioning domain (Asub/FMTCTD) to allow formylation
before acting as the first donor substrate for a peptidyl transferase
enzyme (C domain/large ribosomal subunit).

Observing the same protein in these conformations, including the
novel formylation conformation, highlights the versatility of the small
domains in NRPSs. The small ~100-residue Asub has three distinct
roles in the cycle: providing catalytic residues for the adenylation
reaction; positioning the PCP for the thiolation reaction and later for
the formylation reaction; and bridging the distance between the active
sites the PCP visits. The Asub uses different surfaces for each of these
functions (Extended Data Fig. 9a). In addition, the F domain adds to
the long list of partners with which the equally small PCP domain must
interact (A, E, C, thioesterase, all tailoring domains), and it performs
these functions with overlapping surfaces (Extended Data Fig. 9b).

Adapting a formyltransferase has further increased the functionality
of NRPSs. The formyl functionality seems to be useful in nonribosomal
peptides, as F domains have been incorporated into NRPSs
multiple independent times: the F domains in kolossin A synthetase,
anabaenopeptilide synthetases, the oxazolomycin synthetase,
and other orphan NRPSs and NRPS-polyketide-synthases arose
from a separate fusion event with an FMT and display a different
FMT-FMTCTD-Cpartial-A-PCP domain sequence in their initiation
modules. Sampling additional chemical space can lead to novel or
improved activity in nonribosomal peptides, which has inspired many
bioengineering experiments on NRPSs aimed at meeting the dynamic
challenges of human health. NRPSs are well placed to be engineered
for production of new compounds because their synthetic scheme is
conceptually straightforward, and NRPSs already naturally produce
many therapeutics, as well as promising NRPs like teixobactin and
piperidamycin, two recently discovered first-in-class compounds
with strong antibacterial activity. The structures presented here reveal
the interface between the F and A domains and show the interactions
that the PCP domain makes in the LgrA initiation module. This
knowledge could substantially facilitate our ability to introduce an F
domain into a foreign NRPS, and make formylation accessible in NRPS
bioengineering.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Data reporting. No statistical methods were used to determine sample size. 
Cloning of linear gramicidin synthetase initiation module constructs. 
Genomic DNA was isolated from Brevibacillus parabrevis ATCC 8185 (Cedarlane 
Laboratories) using a GenElute Bacterial Genomic DNA Kit (Sigma-Aldrich). 
Gene constructs comprising F and A domains (F-A) and all three domains
(F-A-PCP) were amplified by PCR from the lgrA gene using the following
primers, designed using sequence alignment with A and PCP domains of a known
structure and the study of Marahiel and co-workers.
FA_fwd: 5′-AATCATCCATGGGAAGAATACTATTCCTAACAACATTTATGAGCAAAG-3′;
FA_rev: 5′-AATCATCTCGAGTTACGCATCGGCCTGCACGTCT-3′;
FAT_fwd: 5′-TGACTACCATGGGGAGAATACTATTCCTAACAACATTTATGAGC-3′;
FAT_rev: 5′-CGTTGAGCGGCCGCTTGCTCCGTAAGCAGACGTTT-3′. PCR
product for F-A-PCP was digested using NcoI and NotI (New England Biolabs)
and ligated into a pET21-derived vector containing an N-terminal octa-histidine
tag with a tobacco etch virus (TEV) protease cleavage site. The PCR product for
F-A-PCP was cloned between NcoI and NotI restriction sites into a pET21-derived
vector containing an N-terminal TEV cleavable octa-histidine tag and a C-terminal
TEV cleavable calmodulin binding peptide (CBP) tag.
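
As a quick sanity check on the cloning primers listed above, the following minimal Python sketch scans each primer for restriction-enzyme recognition sequences. The recognition sequences for NcoI (CCATGG), XhoI (CTCGAG) and NotI (GCGGCCGC) are standard values supplied here, not taken from the text, and the helper name find_sites is hypothetical. The sketch only reports what the transcribed sequences contain; it does not attempt to reconstruct the cloning strategy.

```python
# Recognition sequences for three common restriction enzymes
# (standard values, assumed here; not stated in the Methods text).
SITES = {"NcoI": "CCATGG", "XhoI": "CTCGAG", "NotI": "GCGGCCGC"}

# Primer sequences as listed in the Methods (5'->3', line-break spaces removed).
PRIMERS = {
    "FA_fwd": "AATCATCCATGGGAAGAATACTATTCCTAACAACATTTATGAGCAAAG",
    "FA_rev": "AATCATCTCGAGTTACGCATCGGCCTGCACGTCT",
    "FAT_fwd": "TGACTACCATGGGGAGAATACTATTCCTAACAACATTTATGAGC",
    "FAT_rev": "CGTTGAGCGGCCGCTTGCTCCGTAAGCAGACGTTT",
}


def find_sites(primer: str) -> dict[str, int]:
    """Return {enzyme: 0-based start index} for every site present in the primer."""
    seq = primer.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

With the sequences as transcribed, both forward primers carry an NcoI-type site, FAT_rev carries a NotI-type site, and FA_rev carries a CTCGAG (XhoI-type) site.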

Expression and purification of proteins. The F-A protein was expressed in
Escherichia coli BL21 (DE3) cells. A 10 ml aliquot of overnight culture was used
to inoculate 1 l of lysogeny broth (LB) medium supplemented with 350 µg ml−1
kanamycin. The culture was grown at 37 °C to an optical density (OD600) of
0.6, before inducing protein expression using 0.5 mM isopropyl
β-D-1-thiogalactopyranoside (IPTG) and reducing the temperature to 16 °C for 18 h. Cells
were collected by centrifugation at 4 °C and resuspended in nickel binding buffer
(2 mM imidazole, 150 mM NaCl, 0.25 mM tris-(2-carboxyethyl) phosphine
(TCEP), 50 mM Tris-HCl (pH 7.0)). The cells were lysed by sonication on ice
and centrifuged for 30 min at 20,000g at 4 °C. Clarified lysate was loaded onto
a HiTrap IMAC FF column (GE Healthcare). F-A protein was eluted using a
gradient of 2-250 mM imidazole. Fractions containing F-A were pooled, diluted
tenfold with ion exchange binding buffer (0.25 mM TCEP, 20 mM Tris, pH 8.0),
loaded onto a HiTrap Q HP column and eluted using a gradient to 100% elution
buffer (1 M NaCl, 0.25 mM TCEP, 20 mM Tris-HCl (pH 8.0)). The eluted protein
was concentrated using a 10K MWCO Amicon Ultra-15 filtration unit (EMD
Millipore) and subjected to gel filtration chromatography using a HiLoad 16/600
Superdex 200 column (GE Healthcare) equilibrated with S200 buffer (150 mM
NaCl, 0.25 mM TCEP, 20 mM Tris (pH 7.0)). Protein purity was confirmed using
SDS-PAGE and native PAGE. Pure F-A was concentrated in storage buffer (25%
glycerol, 150 mM NaCl, 0.25 mM TCEP, 20 mM Tris (pH 7.0)), flash-frozen with
liquid nitrogen and stored at −80 °C for later use.

F-A-PCP was expressed in E. coli BL21 EntD-(DE3) cells using the same
protocol as above. Cells were pelleted, resuspended in CBP binding buffer
(25 mM Tris-HCl (pH 7.5), 150 mM NaCl, 2 mM imidazole (pH 8.0),
2 mM CaCl2, 2 mM β-mercaptoethanol (βME) and 0.1 mM
phenylmethanesulfonyl fluoride (PMSF)), sonicated and clarified by centrifugation for
30 min at 20,000g at 4 °C. Clarified lysate was loaded onto a 30 ml
calmodulin sepharose 4B column (GE Healthcare). F-A-PCP was eluted with
elution buffer (25 mM Tris-HCl (pH 7.5), 150 mM NaCl, 2 mM EGTA,
2 mM βME and 0.1 mM PMSF). Protein was dialysed against binding buffer
for a minimum of 4 h before being loaded onto a 5 ml HiTrap IMAC FF
column (GE Healthcare) charged with Ni2+ and equilibrated in nickel binding
buffer. F-A-PCP was eluted using a 60 ml gradient of 0-250 mM imidazole.
Fractions containing F-A-PCP were pooled and affinity tags were removed by
cleavage with TEV protease at room temperature overnight using a 1:4 mg ratio
of TEV to F-A-PCP. Cleaved F-A-PCP was passed back over the nickel and
calmodulin affinity columns, with the flow-through collected, concentrated and
applied to a HiLoad 16/600 Superdex 200 (GE Healthcare) in S200 buffer. Pure
F-A-PCP was concentrated to 5.0 mg ml−1 in storage buffer, flash-frozen in liquid
nitrogen and stored at −80 °C.
Substrate syntheses. Amino-coenzyme A (NH-CoA) was prepared enzymatically
starting from amino-pantetheine (WuXi AppTec) using a previously published
protocol with the following modifications: one-pot synthesis was carried
out at pH 9.0; the amounts of DPCK and ATP were doubled to 9.8 mg and 30 mM,
respectively; and the enzymes were removed using a 10K MWCO Amicon Ultra-15
filtration unit (EMD Millipore). An ATP regeneration system using 0.1 mg ml−1
pyrophosphatase (Roche), 30 mM phosphoenolpyruvate, and 0.1 mg ml−1 pyruvate
kinase (Roche) was also included. The filtrate containing NH-CoA was purified
on a preparative reverse-phase C18 HPLC (35 ml min−1: 0-4 min, 0% B; 4-9 min,
0-98% B, where A is 0.1% trifluoroacetic acid (TFA; Sigma-Aldrich) in H2O and
B is 0.1% TFA in acetonitrile (ACN; Sigma-Aldrich)). NH-CoA was eluted at 7 min
and lyophilized to dryness.
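
To make the HPLC programme above concrete, here is a minimal Python sketch of the programmed solvent composition over time. It assumes the 4-9 min step is a linear ramp from 0% to 98% B, which the text does not state explicitly, and it ignores any system delay volume, so the value it gives is the programmed composition at a given time, not necessarily the composition at the column. The function name percent_b is hypothetical.

```python
def percent_b(t_min: float) -> float:
    """Programmed %B at time t (min): 0% B for 0-4 min, then 0-98% B over 4-9 min."""
    if t_min <= 4.0:
        return 0.0
    if t_min >= 9.0:
        return 98.0
    # Assumption: the ramp between the two set points is linear.
    return 98.0 * (t_min - 4.0) / (9.0 - 4.0)


if __name__ == "__main__":
    # NH-CoA is reported to elute at 7 min.
    print(f"{percent_b(7.0):.1f}% B programmed at 7 min")
```

Under these assumptions the programmed composition at the reported 7 min elution time is about 58.8% B.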
Valine-amino-CoA (Val-NH-CoA) was synthesized by coupling 1 molar
equivalent of NH-CoA with 8 molar equivalents of tert-butoxycarbonyl-
L-valine-N-hydroxysuccinimide ester (Boc-Val-OSu; Sigma-Aldrich) in
N,N-dimethylformamide (DMF; Sigma-Aldrich) with 4 molar equivalents of
N,N-diisopropylethylamine (DIPEA; Sigma-Aldrich) overnight with stirring.
Boc-Val-NH-CoA was purified using the above chromatographic profile and
lyophilized to dryness, then deprotected using 1.5 ml 95% TFA, 2.5% H2O and
2.5% triisopropylsilane (TIPS; Sigma-Aldrich). The deprotection mix was agitated
for 2 h at 25 °C in a thermomixer at 700 r.p.m. before being transferred to 20 ml
ice-cold diethyl ether and incubated at −20 °C for 2 h. The solution was centrifuged
and the pellet was redissolved in 5% aqueous ACN solution and purified with the
same protocol as NH-CoA. Compound identity was verified by mass spectrometry
and nuclear magnetic resonance (NMR) (Supplementary Data 1).
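
Agreement between a calculated and a measured m/z is usually expressed as a mass error in parts per million. The short Python sketch below applies that standard formula to the Val-NH-CoA values reported in the analysis section of these Methods (calculated 850.2304, measured 850.2299). The function name ppm_error is hypothetical, and the ppm figure is derived here and is not reported in the text.

```python
def ppm_error(measured: float, calculated: float) -> float:
    """Mass error in parts per million: (measured - calculated) / calculated * 1e6."""
    return (measured - calculated) / calculated * 1e6


if __name__ == "__main__":
    # m/z [MH+] values for Val-NH-CoA as reported in the Methods.
    print(f"{ppm_error(850.2299, 850.2304):.2f} ppm")
```

For these two values the error is roughly −0.6 ppm.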

Loading phosphopantetheinylates on the PCP domain. Unmodified F-A-PCP
was converted to Val-NH-F-A-PCP by incubating 25 µM apo-F-A-PCP with
5 µM of the promiscuous phosphopantetheinyl transferase Sfp, 0.25 mM
Val-NH-CoA, 10 mM MgCl2 and 25 mM Tris (pH 7.0) for a minimum of 4 h at 25 °C.
To remove Sfp for subsequent crystallization trials, the reaction mix was loaded
onto a Superdex S75 10/300 GL (GE Healthcare Life Sciences) equilibrated in
25 mM Tris (pH 7.5), 150 mM NaCl and 2 mM βME.

SAXS. Inline size exclusion chromatography with small-angle X-ray scattering
(SEC-SAXS) data were collected on the G1 beamline at the Macromolecular
Diffraction Facility at the Cornell High Energy Synchrotron Source at
9.963 keV (1.244 Å) at 7.89 × 10^11 photons s−1. The X-ray beam was collimated
to 250 × 250 µm and the sample cell path length was 2 mm. The G1 beamline was
outfitted with a GE AKTA purifier with a GE Superdex 200 5/150 GL column and
50 µl sample loop. The column was equilibrated in 25 mM Tris (pH 7.5), 150 mM
NaCl and 2 mM βME and the samples were centrifuged for 10 min before sample
injection. Images were recorded on a Pilatus 100K-s detector and normalized using
beam stop photodiode counts. F-A-PCP eluted in a single monomeric peak and
eleven peak exposures were averaged using BioXTAS RAW software. A buffer
scattering curve was created by averaging the first eleven exposures after injection,
and this scattering curve was subtracted from the F-A-PCP scattering curve to
yield the corrected scattering curve for F-A-PCP. Ab initio models were generated
by first creating pairwise distribution functions (P(r)) with GNOM, leading to
twenty independent bead models produced by DAMMIF. Models were aligned,
averaged, and filtered using DAMAVER assuming P1 symmetry. All DAMMIF
models were included in the final DAMAVER model. They had a mean normalized
spatial discrepancy (NSD) value of 0.82 ± 0.052. CRYSOL was used to check how
well the final model fit with our crystal structures. Flexibility was analysed using
EOM, whereby crystal structures of F, Acore, Asub and the PCP were used to
generate a pool of 10,000 models.
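
The beam energy and wavelength quoted above are related by λ = hc/E, and SAXS curves are plotted against the momentum transfer q = 4π sin(θ)/λ, where 2θ is the scattering angle. The following minimal Python sketch illustrates both relations. The constant hc ≈ 12.398 keV·Å is a standard physical value supplied here, and the function names are hypothetical.

```python
import math

# Planck constant times speed of light, in keV * angstrom (standard value).
HC_KEV_ANGSTROM = 12.398419843


def wavelength_angstrom(energy_kev: float) -> float:
    """Photon wavelength in angstrom for a given energy in keV."""
    return HC_KEV_ANGSTROM / energy_kev


def momentum_transfer(two_theta_deg: float, wavelength: float) -> float:
    """q = 4*pi*sin(theta)/lambda in inverse angstrom; 2*theta is the scattering angle."""
    theta = math.radians(two_theta_deg / 2.0)
    return 4.0 * math.pi * math.sin(theta) / wavelength


if __name__ == "__main__":
    lam = wavelength_angstrom(9.963)
    print(f"wavelength = {lam:.3f} angstrom")
    print(f"q at 2theta = 1 degree: {momentum_transfer(1.0, lam):.4f} per angstrom")
```

For the 9.963 keV beam this reproduces the 1.244 Å wavelength given in the text.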

Crystallography. To obtain the crystal structures described in this study, genes
from four species, of up to four domain constructs each (F, F-A, F-A(ΔAsub) and
F-A-PCP), were cloned and assayed for heterologous expression. Purification
was performed for all highly expressing proteins and crystallization trials were
performed, including trials using protein with affinity tags removed or retained,
and in the presence or absence of a variety of ligands (ATP, AMPcPP, AMP, valine,
THF, N5-fTHF, phosphopantetheine, valine amino phosphopantetheine, valine
vinyl sulfonamide adenylate, dead-end THF analogue). Up to 4,032 crystallization
conditions were assayed per protein sample, and gave a total of ~50 ‘hits’, 6 of
which were successfully optimized to allow structure determination. Together, 4 of
these crystal structures (F-A in crystals of space group P4₁2₁2, F-A-PCP in R3:H,
F-A-PCP-PPE-NH-Val in P2₁ and F-A-PCP-PPE in P32), plus an additional
structure including ligands soaked into F-A P4₁2₁2 crystals, captured the states
that represent every major step of the assembly-line synthesis in the LgrA initiation
module and are presented here.

The final crystallization conditions were optimized in 24-well sitting drop
plates, with 2 µl protein sample plus 2 µl reservoir solution in the drop and a 500 µl
reservoir volume, and are as follows. ‘FA’ and ‘FA soak’: protein LgrA F-A
(10 mg ml−1) was crystallized using a precipitant solution of 2 M Na-formate, 0.1 M
sodium acetate (pH 5.3) into space group P4₁2₁2. ‘F-A-PCP’ (open and closed
states): protein LgrA F-A-PCP (5 mg ml−1) was crystallized using a precipitant
solution of 0.92 M AmSO4, 0.1 M bis-Tris (pH 5.5), 1% polyethylene glycol (PEG)
3350 into space group R3:H. ‘F-A-PCP-NH-Val’ (thiolation state): protein
F-A-PCP-PPE-NH-Val (4.7 mg ml−1) was crystallized using a precipitant solution of
12% PEG 20,000, 0.1 M MES buffer (pH 6.7) into space group P2₁. ‘F-A-PCP-PPE’
(formylation state): protein F-A-PCP-PPE (5.5 mg ml−1) was crystallized using
a precipitant solution of 1 M AmSO4, 0.1 M bis-Tris (pH 5.5), 3% PEG 3350 into
space group P32.

Solutions of mother liquor with increasing amounts of glycerol (5%, 10%,
25%) were used to replace the drop solution for cryoprotection. For soaking with the
N5-fTHF, valine and AMPcPP, 10 mM of each was included in the final
cryoprotection solution and incubated for 30 min. (LgrA uses commercially available
N5-fTHF in addition to its natural substrate, N10-fTHF.) Crystals were
flash-cooled in liquid nitrogen and diffraction data sets collected at 200 K using
beamline 8 of the CMCF at the Canadian Light Source (λ = 0.979 Å) in Saskatoon,
Canada.

All data sets were integrated and scaled using the programs HKL-2000 (ref.
41) and iMosflm. Structure determination of FA in the P4₁2₁2 space group
was performed by molecular replacement using a search model of the A domain
from gramicidin Soviet synthetase (note that linear gramicidin and gramicidin
Soviet are made by different NRPSs) with the Asub subdomain removed and side
chains trimmed to the β-carbon, in the program Phaser. Density for the F domain
was visible in the resulting maps. Iterative building in the program COOT and
refinement in the program Phenix produced the final F-A structure. This
structure was then used as a search model to determine the structure of F-A-PCP in
space groups P32, R3:H, and P2₁ by molecular replacement using the program
Phaser, followed by iterative building in the program COOT and refinement in the
programs Phenix and CNS. The highest resolution shell CC* values are: P4₁2₁2,
0.845; P4₁2₁2 (soak), 0.897; P32, 0.883; R3:H, 0.822; and P2₁, 0.822. The quoted
resolution of each structure represents the half-data set correlation coefficient
(CC1/2) of the diffraction data.

Bioinformatics. Multiple sequence alignments (MSAs) were constructed using
Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo) and PROMALS3D
(http://prodata.swmed.edu/promals3d), following database searches using
BLAST (http://ncbi.nlm.nih.gov/blast). MSAs were drawn/edited using Jalview
(http://www.jalview.org). PHYLIP (http://evolution.genetics.washington.edu)
was used to make neighbour-joining trees bootstrapped with 100 replicates, and
FigTree (http://tree.bio.ed.ac.uk) was used to draw them. WebLogo
(http://weblogo.berkeley.edu) was used to draw sequence logos of residue groupings of
interest. AmiGO (http://amigo.geneontology.org) was used to check for
experimentally characterized proteins.
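
Bootstrapping a tree, as mentioned above, starts by resampling alignment columns with replacement to create pseudo-replicate alignments; a tree is then built from each replicate and the support for each node is the fraction of replicate trees that contain it. The Python sketch below illustrates only the column-resampling step. It is not PHYLIP code; the toy alignment, the seed and the function names are hypothetical.

```python
import random


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Resample alignment columns with replacement, keeping rows aligned."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


def bootstrap_replicates(alignment: dict[str, str], n_replicates: int = 100,
                         seed: int = 0) -> list[dict[str, str]]:
    """Generate n_replicates pseudo-replicate alignments (100, as in the Methods)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Hypothetical toy alignment, for illustration only.
TOY = {"seqA": "MKVLAG", "seqB": "MKILAG", "seqC": "MRVLSG"}

if __name__ == "__main__":
    replicates = bootstrap_replicates(TOY)
    print(len(replicates), "replicates; first:", replicates[0])
```

Each replicate has the same dimensions as the input alignment, so it can be passed to the same tree-building step as the original.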

Analysis of synthesized Val-NH-CoA. Val-NH-CoA was verified by both mass
spectrometry (calculated m/z [MH+]: 850.2304; measured m/z [MH+]: 850.2299)
Extended Data Figure 1 | Synthetic cycles in canonical initiation, canonical elongation and LgrA initiation modules. Schematic diagrams comparing
the synthetic cycle in canonical initiation and elongation modules with that in the LgrA initiation module. 
Extended Data Figure 2 | Representative electron density. a–d, 2Fo − Fc
density maps for protein in P4₁2₁2 (a), R3:H (b), P2₁ (c) and P32 (d)
crystal forms contoured at 1σ. e, f, Unbiased Fo − Fc density maps for the
PPE-NH-Val arm in the P2₁ (thiolation state) contoured at 3.3σ (e), and
a P4₁2₁2 crystal soaked with N5-fTHF, AMPcPP and valine contoured
at 2.5σ (f).
Extended Data Figure 3 | Crystal structures of the initiation module
of linear gramicidin synthetase. a–f, Models of F-A (Asub disordered)
(a), F-A-PCP (PCP disordered) (b–d) and F-A-PCP from the four
independent crystal structures determined (e, f). The crystal with space
group P32 diffracted anisotropically to ~3.8 Å resolution, but the other
higher resolution structures enabled the building of high quality models
shown in d and f. Panel labels: open state (P4₁2₁2; R3:H molecule A);
closed state (R3:H molecule B; P32 molecule B); formylation state
(P32 molecule A).
Extended Data Figure 4 | Comparison between the LgrA initiation
module and the SrfAC termination module. a, b, The LgrA initiation
module in the formylation state (a) and the termination module of
surfactin synthase subunit 3 (SrfAC) (b) in the state where aminoacyl-PCP
would be positioned to act as an acceptor substrate in the
condensation reaction (PPE arm not present). The F and C domains
are each positioned directly N-terminal of their A domains and bury
similar amounts of A domain surface area (829 Å² and 903 Å²;
contributing residues shown in spheres), each forming ‘stable platforms’.
Both modules use very large movements of their PCP and Asub domains
to bring the aminoacyl-PCP of the module to distant active sites to act as
the acceptor substrate in an amide bond forming reaction. c, However,
the F-A and C-A interfaces are distinct, and, if the A domains are
superimposed, the F and C domains are only partially overlapping. This
places their active sites in dissimilar locations, necessitating that Asub and
PCP assume different positions to deliver their substrate. The PCP domain
in the formylation state completely overlaps with the position of the
C domain. Panel labels: LgrA formylating initiation module, formylation
state; SrfAC elongation module, condensation acceptor state.
Extended Data Figure 5 | Small-angle X-ray scattering analysis of
F-A-PCP. a, The crystal structure in the formylation state is shown
superposed on the averaged filtered ab initio small-angle X-ray scattering
model generated with DAMAVER, with a NSD value of 0.819 ± 0.052.
b, The calculated scattering curve for the DAMAVER model is overlaid with
the experimental scattering with χ² = 3.010, where I represents scattering
intensity and q is equivalent to 4πsin(θ)/λ. c, To understand the
flexibility of F-A-PCP better, EOM was performed and generated five
different ensembles. The ensemble resembling the formylation state
structure represented over 60% of the optimized models generated,
while the remaining <40% resembled the thiolation state structure.
d, The calculated scattering of the EOM model has a χ² = 1.028, which
demonstrates that F-A-PCP has flexibility. The data are consistent with
extreme flexibility for Asub and PCP domains, and limited flexibility in
F-Acore. e, All independent molecules from the crystal structures were
overlaid to further illustrate the flexibility of the system. f, CRYSOL
was used to generate predicted scattering curves for the formylation
state and thiolation state crystal structures with χ² = 2.12 and χ² = 5.54,
respectively.
Extended Data Figure 6 | Neighbour-joining tree of LgrA F domain
and homologues. This neighbour-joining tree of the LgrA F domain
and homologues was made using PHYLIP (http://evolution.genetics.
washington.edu) based on an initial Clustal Omega alignment of the
closest 220 homologues of the LgrA F domain (BLAST E-value
<1 x 10'4). The most similar formyltransferases to the F domain share
~45% identity, and all of these 220 formyltransferases have only inferred
function. The tree was drawn using the program FigTree (http://tree.bio.
ed.ac.uk). The sequences are named with their GenInfo Identifier (GI)
numbers. Colouring: red, Brevibacilli; green, other Firmicutes; black, other
bacteria; blue, Archaea. The clade of the LgrA F domain is highlighted in 
grey. Only nodes with bootstraps of >50% are shown. Several horizontal 
transfer events are evident where Firmicute and non-Firmicute proteins 
cluster together with high bootstrap values (for example, >70%). The 
several horizontal transfer events of formyltransferase domains between 
Firmicutes and other bacterial groups suggest the LgrA F domain likely 
originated from horizontal transfer. 
Extended Data Figure 7 | Neighbour-joining tree of LgrA A-PCP and
homologues. This neighbour-joining tree of LgrA A-PCP didomains and
homologues was made for the 500 closest homologues (BLAST E-value
<1 x 10!*). The sequences are named with their GI accession codes.
Colouring: red, Brevibacilli; green, other Firmicutes; black, other bacteria.
The significant clades of the LgrA A-PCP domains are highlighted in grey.
Only nodes with bootstraps of >50% are shown. Three functionally
characterized homologues of LgrA that are shown to be directly related are
labelled. The A-PCP portion of the initiation module is quite divergent,
but the second module of LgrA clearly shares a common origin with
functionally characterized NRPSs in Bacilli and other Firmicutes.
Extended Data Figure 8 | Conservation and variation of residues
involved in the interaction interfaces. a, b, Sequence logos made using
the WebLogo server (http://weblogo.berkeley.edu) show conservation
and variation as found in multiple sequence alignments of F domain
residues that interact with the A domain (a) and A domain residues that
interact with the F domain (b). Below each logo are the corresponding
residues in the LgrA proteins from the five Brevibacillus species, with the
crystallized LgrA on the first line. FT, formyltransferase. c, d, Sequence
logos indicate the conservation and variation in F domain residues
involved in binding and interaction with PCP-PPE-Val across the closest
240 homologues of LgrA (c) and all of the functionally or structurally
characterized formyltransferase proteins (d) (reduced for redundancy
so that no two sequences have >50% sequence identity). e, Consensus
sequences for the five Brevibacillus LgrA homologues and for the
formyltransferases of known structure for each of three formyltransferase
types. Catalytic residues are His73, Asn71 and Asp108.
Extended Data Figure 9 | Interaction surfaces in PCP and Asub domains.
a, b, The Asub (a) and PCP (b) domains must maximize the use of
their limited surfaces to interact with their many binding partners. Shown
are the surfaces observed in this study, and many excellent previous studies
have also documented interaction surfaces biochemically or structurally.
This includes, for example, the equivalent of PCP domain residues Met249,
Phe264 and Ala268, which are required for interaction with the C domain
in the acceptor site and form hydrophobic interactions with the
C domain in a very similar manner and using an overlapping surface,
as the PCP domain does to interact with the F domain. Furthermore,
partially overlapping surfaces in PCP domains have been proposed to
interact with their (acyl-)PPE arm to protect thioester intermediates
or to promote binding to the appropriate partner domain. These
interactions might occur during PCP domain transit, but they would
have to be broken before productive binding to partner domains. Several
of these PPE interactions are incompatible with the productive
domain-domain interactions, and in catalytic configurations seen here and
previously, the PPE arms extend into the partner domain and make little
contact with the PCP domain. Panel b labels (PCP domain of LgrA):
A (thiolation); F (formylation); PPE arm attachment; Asub (formylation);
Asub (thiolation).
Extended Data Table 1 | Crystallographic statistics

                      FA                       F-A-PCP                  F-A-PCP-PPE-NH-Val       F-A-PCP-PPE              F-A soak
Data collection
Space group           P4₁2₁2                   R3:H                     P2₁                      P32                      P4₁2₁2
Cell dimensions
  a, b, c (Å)         161.3, 161.3, 138.2      278.7, 278.7, 82.8       77.9, 101.2, 139.6       162.1, 162.1, 208.9      160.8, 160.8, 137.6
  α, β, γ (°)         90.0, 90.0, 90.0         90.0, 90.0, 120.0        90.0, 91.1, 90.0         90.0, 90.0, 120.0        90.0, 90.0, 90.0
Resolution (Å)        87.97-2.46 (2.52-2.46)   80.44-2.80 (2.88-2.80)   81.97-2.60 (2.66-2.60)   83.73-3.80 (4.01-3.80)   87.66-2.80 (2.90-2.80)
Rmerge                0.097 (1.521)            0.072 (1.13)             0.173 (1.64)             0.110 (2.13)             0.127 (1.88)
I/σI                  12.9 (1.4)               10.2 (1.5)               5.5 (1.4)                10.8 (1.3)               10.3 (0.9)
Completeness (%)      100 (100)                100 (100)                100 (99.9)               100 (100)                100 (100)
Redundancy            9.7 (8.4)                3.9 (3.9)                3.8 (3.8)                11.2 (11.3)              14.7 (14.9)

Refinement
Resolution (Å)        63.98-2.46 (2.50-2.46)   46.05-2.80 (2.84-2.80)   47.48-2.55 (2.58-2.55)   48.45-3.77 (3.88-3.77)   49.98-2.81 (2.87-2.81)
No. reflections       66619                    59106                    65091                    32124                    61365
Rwork/Rfree           0.205/0.240              0.237/0.268              0.241/0.289              0.294/0.317              0.256/0.224
No. atoms
  Protein                                      10862                                             11348
  Ligand/ion                                   0
  Water                                        71
B-factors
  Protein                                      106.55
  Ligand/ion                                   N/A
  Water                                        65.00
R.m.s. deviations
  Bond lengths (Å)                             0.006
  Bond angles (°)                              0.828

*Highest resolution shell is shown in parentheses.
†One crystal was used for each structure.
Kyle Larson and his team trek over a mountain pass in Nepal, one of many adventurous expeditions that have allowed him to take his research outdoors.


Extreme research 


Cavers, divers and climbers take their science to strange and wonderful places. 


BY EMILY SOHN 


Marina Elliott never planned to apply
her outdoor adventure skills to a
career in research and exploration. 
But in October 2013, she saw an advertise- 
ment for a project in South Africa that called 
for cave explorers with archaeological experi- 
ence who were also small enough to squeeze 
through a narrow passageway to excavate an 
underground chamber. She was startled by 
how perfectly qualified she was for the job. 
Already an avid rock climber and spelunker, 


Elliott was then finishing up a PhD in biological 
anthropology at Simon Fraser University in 
Vancouver, Canada. She had worked on exca- 
vations in remote places, including Siberia and 
northern Alaska. And she had the flexibility to 
drop everything to spend a month in Africa. 
She joined a team of five other women. One 
by one, they shimmied through a 12-metre-long 
chute with an 18-centimetre-wide pinch point, 
and they emerged with more than 1,500 fos- 
sils from 15 skeletons of a previously unknown 
species of ancient hominin called Homo naledi. 
The discovery helped her to land her current 


postdoc position in biological anthropology at 
the University of the Witwatersrand in Johan- 
nesburg, and she now leads a team of six cavers 
who continue to explore the region. 

Although Elliott’s path required a dose of 
serendipity, her experience illustrates one of 
the many ways that scientists can combine a 
love for outdoor adventure with their career. 
Researchers who pursue extreme fieldwork say 
that the discoveries they make along the way 
provide a lifetime of adventure tales and shape 
their careers in positive ways. 

Combining an extracurricular passion
for the outdoors with a high-stakes career,
however, also brings complications, including 
the risk that nothing will go as planned. When 
preparing for fieldwork, ‘adventure’ research- 
ers need to be particularly careful with logistics 
to ensure success — and their own survival. In 
addition, they often need to acquire specialized 
insurance and build a safety net of teammates 
and strategies to deal with inevitable obstacles. 
Even with the best-laid plans, disaster can still 
come in many forms — from violent weather 
and political strife to crippling injuries. Flex- 
ibility and quick thinking can be the difference 
between a productive trip and a waste of time 
— or worse. 


RISK MANAGEMENT 
Stacy Kim, a marine ecologist at Moss Landing 
Marine Laboratories in California, regularly 
dives beneath the ice in Antarctica to study how 
human pollution affects life on the sea floor. On 
one trip, she spun out on a snowmobile and dis- 
located her shoulder. The weather was bad, and 
the remoteness of the location meant that it took 
several days for medical personnel to reach her 
by helicopter. For the rest of the trip, she was 
stuck on top of the ice, doing lab work instead 
of going underwater. Her teammates did the 
diving instead. “You try to make sure no single 
person is completely irreplaceable,” she says. 
When injuries and other obstacles occur, 
they can prematurely end expeditions — but 
stopping early is not always an option when a 
research agenda is involved. So, like Kim, many 
scientists who work in extreme locations try to 
factor in more time, gear and logistical support 
than they would for trips done purely for fun. 


Ecologist Catherine Cardelus of Colgate 
University in Hamilton, New York, does most 
of her field work in the tree canopies, a task that 
requires climbing up ropes while battling jun- 
gle heat and fending off biting insects. On each 
climb, she lugs a heavy pack filled with sample- 
collecting tags and bags, tape measures, note- 
books, walkie-talkies, water, lunch and other 
supplies for days of work that can keep her in 
the trees for up to seven hours at a time. 

Cardelus says that her field season lasts two or three times
longer than those of scientists whose research occurs on
the ground. Some days of work are inevitably lost because of
rain or wind, so she always makes sure to hit
the trees immediately on good days. But because
it is so exhausting, nobody is allowed to do work
in the canopy for more than three days in a row.

“You have to be incredibly flexible and forgiving,”
says Cardelus. “You wake up in the
morning and there's a shut-out rain, so you
have to say, ‘Oh well, let’s punch in some data.
Let's do lab work.’ You constantly have to be
open to the possibility that you can’t do what 
you need to do.” 

It can help to consider unplanned diversions 
as opportunities, says Kyle Larson, a structural 
geologist at the University of British Columbia's
Okanagan campus in Kelowna. In 2014
on a trip to Nepal, Larson and his team were 
stopped by snow at an elevation of 3,000 metres 
— only halfway to their planned destination. 




JOYS OF DISCOVERY 


Views from the other side 


Ecologist Douglas Larson, emeritus 
professor at the University of Guelph in 
Ontario, had just begun to belay a colleague 
over a cliff in Ontario when some weasels 
emerged from a nearby hole. As the 
colleague descended out of sight, a mother 
ermine and her 13 babies ran over Larson. 
“All I could do was scream out, ‘I’m being
swarmed by ermine!’” he says. “It was one
of those exquisite experiences that happens 
and disappears again.” 

Despite its dangers and frustrations, 
extreme fieldwork can produce incredible, 
once-in-a-lifetime memories for scientists 
who put themselves out there. Kyle Larson 
(no relation to Douglas), a structural geologist 
at the University of British Columbia in 
Kelowna, once brought his father along 
as a field assistant on an expedition to the 
Himalayas. They hiked ahead of their porters, 
and after topping a 5,000-metre-tall peak, 
they stopped in a small settlement. Snow was
falling, and they were getting chilled. Then an
elderly Tibetan woman poked her head out 
of a house and beckoned them inside. She 
gave them salty yak-butter tea. And although 
they could not communicate much, they 
sat around the kitchen’s cooking fire for the 
next hour, sharing the warmth until the rest 
of the team appeared. “Those things happen 
on every trip,” Larson says. “From a cultural
perspective, it becomes very enlightening.” 
Best of all, say adventurous researchers, 
are the rewards that come with pushing the 
limits of discovery. Dangling from cliff faces 
in Canada and France, Doug Larson has 
discovered tiny, 1,000-year-old trees. Up in
the treetops, ecologist Catherine Cardelus, 
an associate professor at Colgate University 
in Hamilton, New York, always marvels at the 
perspective that she gains when she doesn’t 
have to keep her eyes glued on the trail for 
snakes. “Oh my gosh,” she says. “Every day I
see things most people don’t see.” E.S. 
They collected what samples they could, but
they were unable to gather data for a project 
of a student on the trek. She had to abandon 
the Nepal study and later did fieldwork in
Saskatchewan instead, which led to some
significant findings. “If you end up going to places
you didn’t expect to go,” Larson says, “you can
discover things you didn't expect to discover.”

Safety is also a major concern when working 
on the edge, and adventurous scientists rec- 
ommend erring on the side of caution, both to 
protect team members and to sustain funding. 
After all, it can take extra effort to persuade 
a granting agency to give money to support a 
dangerous, lengthy or team-heavy expedition 
that may not go as intended. 

Early in his career, ecologist Douglas 
Larson (no relation to Kyle), now an emeritus 
professor at the University of Guelph, Canada, 
included climbing ropes in a purchase order so 
that he could rappel down cliffs in the Niagara 
Escarpment and study what turned out to be 
extremely ancient trees living there — a dis- 
covery so surprising that forest ecologists in 
the region disparaged the results before they 
saw samples. 

The university's director of safety and security 
told Larson that even one accident would shut 
his project down. From then on, he committed 
to extra precautions and redundancies in his 
equipment, so that even if someone had a heart
attack, he or she could be pulled up. Instead of 
having two or three points of webbing attached 
to the anchors at a time, as do most recreational 
climbers, he and his group use four or five. 


EXTREME BACK-UP 

Strong safety nets become essential in this 
line of work. Elliott's dig in South Africa had a 
medic on standby 24 hours a day. They noti- 
fied several groups about their plans, including 
the South African military and a mining- 
rescue organization, so that they could get help 
quickly in case of a confined-space emergency. 
Many adventure-researchers buy specialized 
insurance that can provide rescue help in 
remote locations or that can cover accidents 
related to their work. 

But no insurance company can charter a 
helicopter-rescue mission to a place such as 
K2, the second-tallest mountain in the world, 
where geologist Mike Searle of the UK’s Uni- 
versity of Oxford has conducted research. He 
recommends building relationships with the 
locals in remote regions, and he always gets to 
know his porters. “If you're in trouble, those 
are the guys who are going to carry you down.”

It also helps to be in shape, which is busi- 
ness as usual for many researchers who scale 
glaciers, climb mountains or dive to the ocean 
floor in frigid water. To prevent injury and to 
stay flexible and strong for caving, Elliott runs, 
hikes and takes exercise classes that combine 
ballet, Pilates and gymnastics. Kyle Larson lifts 
weights six days a week when he is not trekking.
Kim freedives for fun, often down to
18 metres or so. And Searle, now aged
61, bikes to work every day. He also climbs,
swims and surfs. “You can’t climb mountains,”
he says, “if you're a couch potato.”

LAURENT GODIN
Kyle Larson on a research trip to Nepal.

One summer day in 1998, Cardelus was 
dangling from a rope some 24 metres above 
ground, near the top of a tree in the Costa 
Rican jungle, when two howler monkeys 
began to make aggressive motions. Crouched 
about three metres away, they were shaking 
branches and baring their teeth with arms 
outstretched, ready to leap. “I thought, ‘Oh 
my God, here they come,’” she says. Then
she heard “an ancestral guttural sound” — 
not from the two monkeys, but from her 
husband, who was working nearby. The 
monkeys scattered. 

Cardelus — who no longer climbs when 
monkeys are nearby — has experienced 
many such close calls that include run-ins with 
snakes, ants, bees and tarantulas (see ‘Views
from the other side’). “Each time you climb a
tree for the first time, you have to be prepared
to evacuate within 15 seconds,” she says. “It's
always exciting getting into the canopy. And
it's just as euphoric to get to the ground.”

Wildlife is not the only source of heart- 
thumping adventure. One afternoon in the 
spring of 2011, Kyle Larson crested a moun- 
tain pass in Nepal to discover a steep, nearly 
sheer descent buried in waist-deep snow. The 
team could see neither the trail nor what they 
were stepping on. Last year, he arrived in the 
country’s Makalu region on the heels of a
busy storm season that had dumped metres
of snow on the region. Piles of snow reached 
the rooftops, and trekking was treacherous.
“Trying to walk down through that was
scary,” he says. “There were lots of bruised
knees and falls.”

A certain level of psychological prepa- 
ration is crucial for working in extreme 
environments that are, by nature, full of sur- 
prises. And that process often starts before 
the expedition begins. For Elliott, the idea of 
squeezing through an extremely tiny space 
presented the first mental obstacle. When 
Lee Berger, the palaeoanthropologist who 
recruited Elliott and her fellow cavers for the 
South Africa excavation, told them that they 
would need to fit through a small gap, “all of
us ran around our houses measuring furniture
and stuffing ourselves under it”, she says.
She could wedge herself into the space just by 
expanding her lungs. 

As she applied, Elliott worried that she 
wasn't qualified enough or that she had 
screwed up the Skype interview. She contin- 
ued to doubt herself even after arriving on site. 
On the first day of reconnaissance, she looked 
into the 12-metre-long vertical chute that the 
team was to descend. If someone were to get 
hurt, medics would have to tend to the injured 
person until she healed enough to get out on 
her own. “Psychologically, that was quite trying,”
she says. “I remember looking down this
shark’s maw of rock, and you can’t see where
you’re going because it’s not a straight line, and
thinking, ‘Oh, gee, perhaps I’ve miscalculated
my own skill set.’”

Because academic courses typically do not 
cover the ins and outs of survival, Elliott relied 
instead on years of hands-on experience and 
prior training that had taught her to remain 
calm enough to deal with unexpected circum- 
stances. Long before she took on the caving 
job in South Africa, she had worked as a field 
guide for an adventure-tour company in the 
Rocky Mountains and earned a certificate in 
wilderness first aid. Both equipped her with 
survival and decision-making skills. 

It is impossible to predict every emergency,
she says. But one can learn to think quickly
and clearly in any situation. “What you can
prepare for,” she says, “is the mental stability
to say, ‘OK, what do we need to do next? Who
needs to do it?’”

Elliott also advises young researchers to 
pursue all of their life passions, even if they 
seem completely unrelated. She started out 
studying veterinary medicine before earning 
a PhD in anthropology and landing the career- 
changing excavation post in South Africa. “My 
take-home message is, don't fuss if your career 
or life path appears to be a little bit circuitous,” 
she says. “You never really know where any 
given skill set or experience might lead you.”


Emily Sohn is a freelance journalist in 
Minneapolis, Minnesota. 
TRADE TALK
Science integrator 


Lana Gent is a 
director of science at 
the American Heart 
Association in Dallas, 
Texas, where she 
coordinates networks 
of volunteers, the 
drafting of science 
guidelines for 
emergency medicine 
and the production of instructional videos 
on first aid. 


What do you do? 

I help to gather input from resuscitation 
scientists around the world who evaluate 
the scientific evidence that goes into the 
creation of our resuscitation guidelines. 
I’m not usually the one who is giving the 
talk or is the first author, but it is my team 
that ensures that those experts are able to do 
the presentation or create the publication. 


How did you learn about this job? 

Our lab was in a crunch for money and that 
made me think about what I wanted to do. 
Do I continue on this pathway of being a 
traditional lab scientist? Colleagues were 
encouraging me to be a medical-science 
liaison — a professional who teaches physi- 
cians how new medicines work — and I was 
going to interviews. During that process, I 
was contacted by a recruiter representing the 
American Heart Association. I didn’t think I
was the best fit; I didn’t have a
resuscitation-science background or management
experience, but the recruiters knew that I had
transferable skills. I convinced everyone that
if I could learn stereotactic brain surgery in
mice, I could learn resuscitation science. 


Why is this job right for you? 

No one goes to school for the type of posi- 
tion I have. You wear a lot of hats. The 
hiring manager could see that I took initia- 
tive and was passionate. I had shown that 
I could take on new challenges and bring 
people together. 


What has the job taught you? 

I had to learn to be resilient and inquisitive
and not walk along just one path. Some- 
times as scientists we pride ourselves on 
being contemplative, and the greatest skill 
set here is to simplify information and to be 
quick on your feet.


INTERVIEW BY MONYA BAKER 
This interview has been edited for length and 
clarity. For more, see go.nature.com/iyssud. 
PROJECT DAFFODIL


BY SYLVIA SPRUCK WRIGLEY 


The Daffodil Project is a joint
venture between the United
Nations, ESA and Friends
of the Elderly. I am their 17th civil- 
ian recruit. For obvious reasons, the 
entire project is extremely hush-hush. 

My daughter hates the Project. 

“Maman,” she said. Amélie visits 
once a week. “Maman, let me take 
you out of here. There’s no mission.” 
I thought we’d had our funding cut.
Then I saw the look her son Daniel 
gave her. Poor Amélie. She doesn't 
think I should be considering 
space travel, not at my age. Every 
visit she’s tried to tell me that the 
Daffodil Project doesn’t exist. 
We've had this conversation so 
many times. 

“What’s the harm,” Daniel 
whispered, like I can’t hear. At 
least he understands I’m not abandoning
them, or if I am, well... funny, that my
grandson would be the one to understand. 

“Tell me again, grand-mère, how are they
going to get all of you to Mars? Will you get 
a window seat? Can you send me pictures?” 

Amélie scowled. She's always hid her fear 
under a face full of anger. She can't bear the 
thought of my leaving her, even though she 
knows it will happen someday. I wish I could 
comfort her like I did when she was a little 
girl scared of thunder. She'll be 50 in Febru- 
ary but she’s still my Mélie. 

“I don't think the space-transport will
have windows,” I told Daniel. “It’s not a 
pleasure cruise. But once we arrive, we'll be 
able to see Mars and get instructions from 
Earth. We have some sort of data transfer 
over satellite for systems monitoring. I bet 
we'll even have voice data. I'll have to ask 
Jacques...” 

The nurse walking past smiled. “Aren't 
you the clever one, Madame.” 

That was when Amélie stormed out, 
threatening to shut the place down as she 
dragged her son out the door. I didn’t mind. 
I could see Claudine setting up the card table 
for a game of belote. 

These are my friends now. Paul spread 
chocolates on the card table, a gift from his 
granddaughter. Awful 
enough. We're trying to emulate life on the
space station so the food is not very good, all 
government approved. The thing is, when 
we get up there, there's no coming back. So 
if an addiction to croissants or cognac is 
going to break your heart, you’d better find
out now. 

Or if leaving for Mars is going to break 
your daughter's heart. 

Claudine was dealer, as always. She's 83, 
the eldest. She’s also the best belote player, 
although Jacques swears she cheats. 

There's two dozen of us here in the home, 
if I don’t count Monsieur Coulon, which 
I don't because he’s really just a bundle of 
robes and drool. I guess they keep him with 
us as a test, to see how we cope when some- 
one loses it completely but doesn't die. The 
nurses said not everyone can go to space. So 
maybe they’re sending a dozen, or less? 

We're all very well behaved because we 
want to get chosen. It’s true, Mars’s surface 
is too cold for space colonies. But heating up 
a planet is not a problem; it just takes time. 
It’s on the television every evening how we've 
overheated this one, after all. Jacques and I 
have worked out the super-secret plan: send 
a small team to Mars to oversee operations. 
Repeat as necessary until the planet is inhab- 
itable and then invite the masses. Voila: new 
homeland. 

Amélie thinks I'm deluded. She half expects 
me to transfer my life's savings to a con-man 
or something. But honestly, my life's savings 
are a €800-per-month pension and the house 
that Antoine and I bought in 1972.
Really, what do I have to lose? 

What Amélie doesn’t understand is 
that we aren't colonizing. We're Project 
Daffodil, the first inhabitants of chilly 
Mars. Jacques and I, we've worked it all 
out. They can't afford to keep ferrying 
people back and forth. But we've got 
some time left, five years, maybe ten. 
Our job is to keep an eye on things and 
wait to die. 

If you want to save money with a 
one-way trip, it makes perfect sense 
to recruit from a nursing home, now 
doesn't it? 

This is why we practise sur- 
vival skills. We can’t live on the 
surface, obviously, so we'll have
a dome. It’s going to be small and
uncomfortable, like our shared
rooms here: just enough space
for two beds and a cupboard, no
chance of privacy.

If it is just four of us, I hope I get to go with 
Claudine and Paul and Jacques. We play 
belote every afternoon and tell stories and 
have a grand time when we can convince 
one of the grandchildren to bring us pastis. 
I guess they won't give us pastis up there, but 
we'll take the cards. We'll get by. 

Amélie keeps making up excuses to try 
to get me out of here. She’s frantic for me to 
come and live with her now, leave the home 
and the Project. 

I told her that I'll think about it, but 
Jacques thinks they are confirming the final 
crew soon. Maybe even tonight. 

Since Antoine died, every morning I wake 
up and remember that I'm decrepit. That he 
abandoned me to live my old age alone. Every 
morning I remember that there's no future, 
my entire life is in the past. So, what do I do? 

I could deaden it with pastis until some- 
one carts me away. 

Or I could go to Mars. 

Tell Amélie that I will miss her.